(54) Methods and systems for multi-modal browsing and implementation of a conversational
markup language



(57) A new application programming language is provided which is based on user interaction
with any device which a user is employing to access any type of information. The new language
is referred to herein as a "Conversational Markup Language" (CML). In a preferred embodiment,
CML is a high level XML based language for representing "dialogs" or "conversations" the
user will have with any given computing device. For example, interaction may comprise, but is
not limited to, visual based (text and graphical) user interaction and speech based user
interaction. Such a language allows application authors to program applications using
interaction-based elements referred to herein as "conversational gestures." The present
invention also provides for various embodiments of a multimodal browser capable of supporting
the features of CML in accordance with various modality specific representations, e.g., HTML
based graphical user interface (GUI) browser, VoiceXML based speech browser, etc.
Description

Field of the Invention 

[0001] The present invention generally relates to information access applications and, more particularly, to an
interaction-based markup language and multi-modal browsing mechanisms supporting the interaction-based markup
language for use with such information access applications.

Background of the Invention 


[0002] Given the dramatic increase in the availability of various types and quantities of information and a sharp
decrease in time and/or availability of traditional facilities to access such information, individuals currently desire to be
able to access, act on, and/or transform any information from any device at any time. In the case of the Internet, for
instance, large quantities and varieties of information are available; however, the Internet traditionally supported
mostly devices that access information using a HyperText Markup Language (HTML) browser on top of a HyperText
Transfer Protocol (HTTP) network. This was provided on top of TCP/IP (Transmission Control Protocol / Internet
Protocol).

[0003] Solutions to this problem centered around rewriting application programs used to access such information so
that the information could be accessed in other ways. One solution led to the development of the Wireless Application
Protocol (WAP), see http://www.mobilewap.com. WAP is equivalent to HTTP for a wireless network. A Wireless Markup
Language (WML) was developed which is equivalent to HTML for a wireless network. Thus, similar to how HTML is
used on top of HTTP, WML is used on top of WAP. WAP and WML allow a user to access the Internet over a cellular
phone with constrained screen rendering and limited bandwidth connection capabilities. CHTML is another example
of an ML (markup language) addressing this space.

[0004] More recently came the development of a mechanism for bringing the Web programming model (also
known as the fat client programming model) to voice access and, in particular, to telephone access and Interactive Voice
Response (IVR) systems. Such a mechanism is typically known as a speech browser (or voice browser). The speech
browser may use a speech based variation of the Extensible Markup Language (XML) known as VoiceXML, see, e.g.,
http://www.voicexml.org. The speech browser can also operate on top of the WAP protocol and in conjunction with
exchanges of WML data.

[0005] However, such an approach poses certain problems for application programmers if they want to offer
multi-channel support, i.e., offer access to web browsers (HTML browsers), phones (voice browsers) and wireless
browsers (WML) or multi-modal/conversational browsers, as defined in the aforementioned disclosures. First, with this
approach, the application programmer must deal with at least three different languages when developing an application,
e.g., HTML, WML and VoiceXML. That is, the application must account for the fact that, since a user is going to be
accessing Internet based information via a speech browser over a conventional telephone, or over a wireless connection
using a WAP browser, or using a conventional web browser, HTML, WAP and VoiceXML must be employed when writing
the application. This is known to be quite burdensome to the application developer. Secondly, with this approach, there
is no suitable way to synchronize multi-modal applications, for example, applications that provide for both visual and
speech based user interaction with the browser or browsers employed to access the application.

[0006] Applications have traditionally been developed such that both content (i.e., information or other data) and
presentation (i.e., manner in which the content is presented to the user) were mixed. However, in an attempt to simplify
application programming, an effort was made to separate content from presentation. This led to the development of
the Extensible Stylesheet Language (XSL), which operates in conjunction with XML such that content associated with
an application is stored in XML and the transformations necessary to present the content on a specific device are
handled by XSL, see http://www.w3.org/Style/XSL. Such an approach has been adopted by the W3C (World Wide Web
Consortium). This approach is typically used to adapt the presentation to the characteristics of the main browsers
(e.g., different versions of Microsoft Internet Explorer, Netscape Communicator/Navigator, other less popular browsers,
etc.). Some have tried to extend this use to other modalities/channels (e.g., a wireless browser supporting a format like
WML on top of embedded devices (wireless phone or PDA)). This last approach has never been very successful or
convenient and in any case it requires multiple authoring of the XSL pages. Moreover, this approach has the
disadvantage of being both application and device/channel dependent. That is, XSL rules are dependent on the
application and device for which the content is to be transcribed. Thus, if an application is to be accessed from a new
device, new XSL transformations must be written for that device.

[0007] Other attempts to overcome some of these problems have been made. There have been attempts to provide
an XML model based on user intention (complex and generally task oriented intentions). User intentions may be
modeled with complex components that cannot be, or are very difficult to be, rendered on devices with small screens or
with speech. These complex components, not decomposed into smaller atomic components, also cannot be tightly
synchronized across modalities. Tags independent of the device are offered which are rendered by different browsers.
Also, some extensions to speech interactive voice response (IVR) systems have been proposed. However, among
other deficiencies, these attempts do not model dialog, and transcoding from modality to modality is generally an
impossible task.

[0008] In these approaches, user intentions are modeled with complex components that describe complex
interactions. However, they are typically application-specific. That is, they depend on, characterize, or directly involve
business logic concepts and elements. Therefore, in the same way that XSL rules (and XSL style sheets) are today
fundamentally a function of the application or application domain (i.e., the nature of the XML attribute involved), the
XSL rules used to transform pages written with these languages are also fundamentally a function of the application
or application domain. They must be re-written for each new application. This characterizes the limitation of these
approaches. These approaches do not help to offer access to content independent of the access
modality. Indeed, these approaches only allow access to content related to this application or application domain. Any
other case requires rewriting the transformation rules. Thus, there is a need to free transformation rules from the
backend application and to make them depend only on characteristics/modalities supported by the access device or
channel.

[0009] Note that in some cases, support of multiple channels has been achieved by using cascades of stylesheets 
and treating the resulting XML stream as serialized internal APIs (Application Programming Interfaces). Again, this 
requires multiple authoring. 

[0010] In addition, the above approaches result in having very complex intention models with such components that
do not have corresponding rendering appropriate in modalities like WML. It is apparent that these models were designed
to offer the capability to customize the graphical user interface (GUI) presentation to requirements of different types
of display (i.e., essentially within variations of the same channel and modality) or browsers. As a result, none of these
approaches appropriately model and treat speech or multi-modal user interfaces.

[0011] As already mentioned, conventional transcoding (XSL rules used to present the XML content and change of
XSL style sheet to go from one modality to another) has been considered to support different access modalities. This
means that for a given XML content, by changing the XML rules, the system can produce an HTML page, a WML
rule, or even a VoiceXML page, etc. Actually, this is what is being used today to support the different web browsers on
the market, e.g., Netscape Communicator, Microsoft Internet Explorer, Sun Microsystems Hot Java, Spyglass browser,
Open Source Amaya browser/editor, etc. Unfortunately, this is possible only if:

(i) The XSL rules are application or application domain specific (i.e., the nature of the XML attribute); and
(ii) Transcoding is between two languages, for example HTML to WML, and the original content has been built in
HTML while following very strict rules of authoring. Indeed, this is enforceable only within a given company, for
a given web site. Even in those cases, it is hardly implementable, in general, because of missing information across
markup languages or modalities in order to provide the corresponding components in other modalities (e.g., an
HTML form or menu does not provide the information required to render it automatically by voice) as well as different
dialog navigation flows in different modalities.

[0012] Accordingly, there is a need for an application programming language and information browsing mechanisms
associated therewith which overcome these and other shortcomings attributed to existing languages and browsers.

Summary of the Invention 

[0013] The present invention provides for a new application programming language which is based on user interaction
with any device which the user is employing to access any type of information. The new language is referred to herein
as a "Conversational Markup Language" (CML).

[0014] In a preferred embodiment, CML is a high level XML based language for representing "dialogs" or
"conversations" the user will have with any given computing device. While the terms dialog and conversation are used
herein, it is to be appreciated that they more generally refer to a user's interaction with a device (e.g., a local device, a
remote device (e.g., interaction over the telephone), or any otherwise distributed device), independent of the modality
and the device. Thus, interaction may comprise, but is not limited to, visual based (text and graphical) user interaction
and speech based user interaction and combinations of them.

[0015] Such a language allows application authors to program applications using interaction-based elements referred
to herein as "conversational gestures." Conversational gestures are elementary programming components or elements
of CML that characterize any dialog, independent of the modalities, the devices, or the browsers employed to access
information associated with an application programmed in accordance therewith.

[0016] The invention accomplishes these and other features and advantages by defining a new application
programming paradigm. As mentioned above, existing application authoring approaches have adopted the concept of
separating the content based aspects of an application from the presentation based aspects. In accordance with the
present invention, CML introduces a new paradigm which provides for separating application programming into content
aspects, presentation aspects and interaction aspects. By focusing on the interaction aspect of an application with
respect to a user, an application may be written in a manner which is independent of the content/application logic and
presentation. It is to be appreciated that the content and/or the business logic of an application is also referred to as the
"back-end logic" associated with the application.

[0017] In a client/server arrangement, the "back-end logic" is the portion of an application that contains the logic,
i.e., the encoded set of states and conditions that drive the evolution of an application, as well as variable validation
information. As will be explained, attribute constraint and validation information can be added to a CML page to carry
logic information separated from the back-end data. Thus, as will be explained and illustrated below, after an application
is created in CML, a portion of the CML code associated with the application is downloaded to a client device or devices
from a server and the CML gestures of the CML code are then transcoded to the browser-specific markup languages
employed at the device or devices, e.g., HTML and/or VoiceXML.

[0018] In accordance with the invention, a device (a client, or even a server serving CML pages into possibly other
legacy markup languages like HTML, VoiceXML, WML, etc.) operating with downloaded CML code can transcode to, for
example, HTML and VoiceXML, substantially simultaneously so as to synchronize the multiple browsers providing the
user with access to information. Such advantageous synchronization according to the invention is possible because
the transcoding is done gesture by gesture with gesture identification. Thus, when an input/output event occurs in one
modality, the browser knows what event occurred for what gesture and can immediately update all the supported
modalities. This results in a very tight synchronization across modalities. Such synchronization is also achieved due
to the fact that the various modality-specific user interface dialogues, e.g., associated with a graphical user interface
(GUI) browser or a speech browser, are generated from a single CML representation, on a gesture by gesture basis.
Thus, the multiple user interfaces, e.g., GUI, speech, etc., are synchronized and continuously updated as a user
interactively proceeds with one or the other modality. CML and the browsing mechanisms of the present invention also
provide a platform for natural language (NL) programming. Since CML allows an application author to program gesture
by gesture, such an application provides the flexibility for a user to provide requests/responses in a wide range of
natural conversational manners. Thus, the user is not restricted to simple commands but rather can interact with an
application in a less restrictive manner, e.g., more closely resembling a natural conversation. With NL and the invention,
the user can express himself freely in multiple modalities, with no constraint other than to carry on a natural conversation
as if it were carried on with another human being. In the case of NL, in addition, the system may use context and past
interaction/dialog history (as well as other meta-information like user preferences, application settings, stored common
knowledge, etc.) to disambiguate queries.
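
As a minimal sketch of gesture-by-gesture transcoding with gesture identification, the following Python fragment renders one CML select gesture into an HTML view and a VoiceXML-like view, each tagged with the same gesture id. The function names, the rule table, the `g0`-style ids, the `data-gesture` attribute and the target markup are illustrative assumptions, not the transcoding rules of the invention.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical CML page; element names follow the select example in this document.
CML = """
<cml>
  <select name="PersonTitle" selection-policy="single">
    <message>Person Title</message>
    <choices>
      <choice value="MR">Mr.</choice>
      <choice value="MRS">Mrs.</choice>
    </choices>
  </select>
</cml>
"""

def _options(gesture):
    # Both target views happen to use an <option> element for each choice.
    return "".join(
        f'<option value="{escape(c.get("value"))}">{escape(c.text)}</option>'
        for c in gesture.iter("choice")
    )

def select_to_html(gesture, gid):
    """Render a select gesture as an HTML control tagged with its gesture id."""
    label = escape(gesture.findtext("message"))
    name = escape(gesture.get("name"))
    return (f'<label>{label}<select name="{name}" data-gesture="{gid}">'
            f'{_options(gesture)}</select></label>')

def select_to_voice(gesture, gid):
    """Render the same gesture as a VoiceXML-like field (illustrative only)."""
    label = escape(gesture.findtext("message"))
    name = escape(gesture.get("name"))
    return (f'<field name="{name}" id="{gid}"><prompt>{label}</prompt>'
            f'{_options(gesture)}</field>')

# Assumed rule table: one rendering rule per gesture type and per modality.
RULES = {"select": {"html": select_to_html, "voice": select_to_voice}}

def transcode(cml_text):
    """Transcode every gesture into every modality, keyed by gesture id."""
    root = ET.fromstring(cml_text)
    views = {"html": {}, "voice": {}}
    for index, gesture in enumerate(root):
        gid = f"g{index}"
        for modality, rule in RULES[gesture.tag].items():
            views[modality][gid] = rule(gesture, gid)
    return views

views = transcode(CML)
state = {}
# An input event in either modality is recorded under the gesture id, so the
# other view of the same gesture can be refreshed from the same key.
state["g0"] = "MRS"
```

Because both views carry the same gesture id, an event observed in one modality identifies exactly which fragment of the other modality needs updating.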

[0019] NL is a statement which is not limited to speech but encompasses all aspects of a natural multi-modal
conversational application. It combines NL inputs with natural multi-modal input. As described in the PCT international
patent application identified as US99/23008 (attorney docket no. Y0998-392) filed on October 1, 1999, any input is
modeled independently of the modality as an input/output event that is then processed by a dialog manager and
arbitrator that will use history, dialog context and other meta-information (e.g., user preference, information about the
device and application) to determine the target of the input event and/or engage a dialog with the user to complete,
confirm, correct or disambiguate the intention of the user prior to executing the requested action.
[0020] It is also to be appreciated that the present invention provides for a multi-device or distributed browsing
environment. Due to the nature of CML and its ability to effectively synchronize multiple browsers, various portions
of an application may reside and be executed on separate computing devices. A user may then simultaneously interact
with more than one device, e.g., a laptop computer and a cellular phone, when accessing an application. This is referred
to as "multi-device browsing." Actually, this aspect of the invention does not require "multi-modality." That is, even with
only GUI/HTML browsers, the gesture-based XSL rules can be used to define what is rendered on what browser.
Accordingly, some content can be displayed on a personal digital assistant or PDA (e.g., color images, streamed video,
long lists), while the rest is displayed on the cell phone screen, etc.
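
The routing of content to devices described above can be sketched as follows. This is only an illustration in Python: the gesture descriptors, the content kinds and the rule that rich content goes to the PDA are hypothetical, and the document itself expresses such rules as gesture-based XSL rules.

```python
# Hypothetical gesture descriptors: (gesture id, kind of content it carries).
GESTURES = [
    ("g0", "long-list"),
    ("g1", "color-image"),
    ("g2", "short-prompt"),
    ("g3", "streamed-video"),
]

# Assumed rule: content that needs a larger display is routed to the PDA.
RICH_CONTENT = {"long-list", "color-image", "streamed-video"}

def assign_device(kind):
    """Pick the device on which a gesture of this kind is rendered."""
    return "pda" if kind in RICH_CONTENT else "phone"

def partition(gestures):
    """Split the gestures of one application across the two devices."""
    layout = {"pda": [], "phone": []}
    for gid, kind in gestures:
        layout[assign_device(kind)].append(gid)
    return layout

layout = partition(GESTURES)
```

Changing the rule table changes which device renders which gesture without touching the gestures themselves.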

[0021] Given the modality-independence of CML, even after an application is written, any transcoding rules
associated with any type of browser may be implemented. That is, CML allows the author to change to another type of
transcoding (i.e., the gesture based transcoding rules), other than any default transcoding that may have originally
been implemented. Thus, through simple updates of gesture based XSL rules, this feature advantageously guarantees
support for new releases/versions of the so-called "legacy languages," e.g., HTML, WML, VoiceXML, etc., and for new
languages, e.g., CHTML, HDML, etc. In addition, this feature permits a simple and easy passage from one version of
CML to a new one using simple gesture based XSL rules. It is to be appreciated that gesture by gesture transcoding
from version to version is not a different problem from transcoding from CML to another legacy language. This is
especially advantageous as CML is designed, by definition, around the principle of this transcoding. This is certainly not
true for most of the other markup languages, where upgrades of the specifications, while possibly offering backward
compatibility, are usually problematic for new generation browsers, as well as with respect to all the older content written
in older versions.

[0022] CML also permits cosmetic altering of a presentation even after a CML page is written. For example,
depending on the desired modality and target markup language, a CML command may be issued to cosmetically alter
some feature of the presentation of content, in some modalities. This allows CML developers to put in the same amount
of cosmetic effort as they would for optimal HTML rendering. The advantage, of course, is that for the same price
they have obtained a multi-channel description of the interaction (i.e., one able to be expressed in multiple types of
target ML, device modalities or specific user interface characteristics) that can be used to provide universal access
(independent of the access device or channel) and/or tightly synchronized multi-modal and conversational user interfaces.
[0023] The present invention also provides for various embodiments of a multi-modal browser capable of supporting
the features of CML in accordance with various modality specific representations, e.g., HTML based graphical user
interface (GUI) browser, VoiceXML based speech browser, etc.

[0024] It is to be noted that the term "CML" is used in the above-referenced patent application identified by attorney
docket nos. Y0998-392 and in the U.S. provisional patent application identified as U.S. Serial No. 60/128,081 filed on
April 7, 1999 and the U.S. provisional patent application identified by Serial No. 60/158,777 filed on October 12, 1999,
both of them being identified by attorney docket no. Y0999-178. In these applications, the term is meant to refer to a
declarative way to describe conversational interfaces. In accordance with the present invention, the term CML refers
to a gesture-based language which embodies the concept of programming by interaction, as will be explained in detail
below.

[0025] Given such aspects of the present invention, as well as others that will be explained below, we now discuss
some important differences between such inventive features and existing approaches. The exponential growth of the
World Wide Web (WWW) during the last five years has pointed out the inherent strength in constructing light-weight
user interface applications by first separating user interaction from content, and subsequently delivering application
front-ends via markup languages like HTML that are rendered by a platform-independent WWW browser. This
architecture opens up a new world of possibilities by liberating end-user applications from details of the underlying
hardware and operating system. The current WWW architecture has liberated visual interfaces to e-commerce
applications from details of the underlying hardware and operating system. The next step in this evolution is to make
end-user applications independent of the interface modality and device used to interact with electronic information. This
evolution is the natural next step in enabling speech-based interaction with the new generation of e-commerce
applications.
[0026] To achieve end-user WWW services that are device and modality independent, there is a strong need to
author such applications and services using modality independent technologies that enable delivery to a variety of
devices. With XML fast becoming the next-generation lingua franca of the WWW, it is natural to design such languages
as XML applications.

[0027] Modality-independent WWW services can thus be achieved by designing an XML-based language for
authoring information content and interaction logic that is modality independent, and then delivering the resulting
application in a manner most appropriate to the target device. This naturally leads to the design of languages that
separate information content, information presentation and interaction logic into distinct components. The WWW has
already evolved towards separating out content from presentation by adopting style sheets; the next evolutionary step is
to factor out interaction logic from information content. At present, external standards activities in this area are expected
to emerge from industrial consortia such as the W3C within its XFORMS and voice browser working groups.

[0028] The separation outlined above leads to an approach we refer to as conversational computing: end-user
applications and services are expressed as an aggregation of modality-independent conversational gestures, where each
conversational gesture encodes an atomic piece of the man-machine dialogue making up the user interaction.
[0029] The insights outlined above are validated by the fact that there have been a few attempts at designing
intention-based markup languages in the recent past. They were initially designed to abstract variations in visual
presentation amongst different devices, e.g., small screen handhelds versus desktop PCs. As speech interfaces become
relevant, these languages are presented as a possible means for authoring end-user applications for delivery to speech
devices, in addition to the different visual displays that were their original target.

[0030] CML, according to the present invention, is designed from the ground up as an XML-based language for
authoring modality-independent user interaction, with a special focus on the new requirements introduced by the need
to address conversational interfaces comprising speech and natural language technologies. This focus on speech
as a first-class citizen in the user interface has caused CML to evolve in a manner distinct from previous attempts. We
will contrast some of these key differences.

(i) Overlays Interaction On Data Model


[0031] All prior art languages define the user intentions and the underlying data model that is populated by the user
interaction within the same piece of markup. Here is a short example from a specification to illustrate this. The fragment
of markup shown below would be used to obtain a person's title (Mr., Mrs., or Ms.). Notice that the definition of the
datatype being prompted for is intermixed with the markup that produces the user interaction.



<prior art ML>

<CHOICE NAME="PersonTitles"
        SELECTION.POLICY="SINGLE">
  <CAPTION>Title</CAPTION>
  <HINT>This is a set of valid titles for a person.</HINT>
  <STRING NAME="MR">
    <VALUE>Mr.</VALUE>
  </STRING>
  <STRING NAME="MRS">
    <VALUE>Mrs.</VALUE>
  </STRING>
  <STRING NAME="MISS">
    <VALUE>Miss</VALUE>
  </STRING>
  <STRING NAME="MS">
    <VALUE>Ms</VALUE>
  </STRING>
</CHOICE>

</prior art ML>

Compare the above with the CML representation for obtaining the person's title shown below. Notice that we separate
the definition of the datatype, i.e., the enumeration type that lists valid person titles, from the user interaction component,
namely, the select gesture.
[0032] We first define the enumeration type PersonTitle:



<enum name="PersonTitle" type="string">
  <value>MR</value>
  <value>MRS</value>
  <value>MISS</value>
</enum>



Once defined, field PersonTitle can be instantiated at multiple points in the user interaction via an appropriate CML
gesture. Below we illustrate this with gesture select.



<select name="PersonTitle" selection-policy="single">
  <message>Person Title</message>
  <choices>
    <choice value="MR">Mr.</choice>
    <choice value="MRS">MRS.</choice>
    <choice value="MISS">Miss.</choice>
  </choices>
</select>



Separating the conversational gesture (gesture select in the above example) from the definition of the underlying
datatype (enumeration PersonTitle above) provides a number of advantages:

(1) By separating the conversational gesture from the data definition, we can author multiple user interfaces for
prompting for the person title, e.g., when internationalizing the above dialogue. Thus, a German version of this
dialogue constructed in CML would require only the conversational gesture to be modified. Notice that when the
prior art representation shown earlier is internationalized, what needs to change are the contents of elements caption,
hint and code value; the definition of the underlying enumeration type remains the same. However, by overlaying
the user interface markup on the data definition, that design fails to isolate the changes needed to internationalize
the dialogue. Note that some of the previous languages work around this explicit problem of internationalization
by introducing the notion of templates which then get re-used by the author for producing different language
versions of the above dialogue. However, this does not eliminate the basic underlying problem, i.e., the data definition
and user interface still remain linked in the template definition.

(2) Once field PersonTitle is defined, CML gestures can refer to this field at multiple points in the user interaction.
Thus, once the user has specified a value for field PersonTitle, subsequent portions of the dialogue can refer to
the supplied value when producing prompts, e.g., "Welcome to the electronic store Mr. Smith."
(3) Applications authored in CML are also free to prompt the user for a specific field such as PersonTitle at different
points in the user interaction, with the user having the freedom to decide at which point he/she supplies a value
for that field. This form of flexibility is especially vital in designing natural language interfaces, and is again a
consequence of separating the markup that defines the model from the markup that declares the user interaction.
Without this separation (as in the prior art at present), the above would force the author to define field PersonTitle
multiple times.
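
The benefit of keeping the enumeration separate from the gesture can be sketched in Python as follows. The German labels and the helper names are illustrative assumptions; the point is only that two differently worded select gestures can be checked against one unchanged definition of PersonTitle.

```python
import xml.etree.ElementTree as ET

# The data model is defined once.
ENUM = ET.fromstring(
    '<enum name="PersonTitle" type="string">'
    '<value>MR</value><value>MRS</value><value>MISS</value></enum>'
)

# English gesture, as in the example above.
SELECT_EN = ET.fromstring(
    '<select name="PersonTitle" selection-policy="single">'
    '<message>Person Title</message><choices>'
    '<choice value="MR">Mr.</choice><choice value="MRS">Mrs.</choice>'
    '<choice value="MISS">Miss</choice></choices></select>'
)

# Hypothetical German gesture: only the labels differ, the values do not.
SELECT_DE = ET.fromstring(
    '<select name="PersonTitle" selection-policy="single">'
    '<message>Anrede</message><choices>'
    '<choice value="MR">Herr</choice><choice value="MRS">Frau</choice>'
    '<choice value="MISS">Fräulein</choice></choices></select>'
)

def allowed_values(enum):
    """Values permitted by the enumeration type."""
    return [v.text for v in enum.findall("value")]

def gesture_values(select):
    """Values a select gesture can submit, regardless of its labels."""
    return [c.get("value") for c in select.iter("choice")]

def is_consistent(select, enum):
    """True if the gesture targets this field and offers only allowed values."""
    return (select.get("name") == enum.get("name")
            and set(gesture_values(select)) <= set(allowed_values(enum)))
```

Both `is_consistent(SELECT_EN, ENUM)` and `is_consistent(SELECT_DE, ENUM)` hold, so internationalizing touches only the gesture, never the enumeration.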

[0033] To see the above, consider a mutual funds application that allows the user to buy and sell mutual funds as
well as to find out the net value of a specific asset. In a simplified version of this interaction, the system needs to obtain
two items of information from the user:

(a) User action, e.g., buy, sell or net asset value;
(b) Asset to act on, e.g., fund to buy.

[0034] When using a natural language interface for the above example, the user is equally likely to specify either the
action to perform, the asset to act on, or perhaps both when initially prompted by the system. Depending on what is
specified, the dialogue now needs to transition to a state where the system prompts for the missing piece of information;
alternatively, if both action and asset are specified, the system needs to produce a confirmation prompt of the form:
"Would you like to action specified fund?" Given that the prior art currently overlays the interaction markup, i.e., in this
case, element CHOICE, on the data definition, it becomes impossible for the application author to specify the user
interaction for obtaining the value of the same field, e.g., asset, at different points in the user interaction.
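
The dialogue transitions described for the mutual funds example can be sketched as a small Python function. The slot names `action` and `asset` follow the text; the prompt wording and the function name are assumptions made for illustration.

```python
def next_prompt(slots):
    """Return the system's next prompt given the fields filled in so far."""
    action = slots.get("action")
    asset = slots.get("asset")
    if action is None and asset is None:
        return "What would you like to do?"
    if action is None:
        # The user named the asset first; prompt for the missing action.
        return f"What would you like to do with {asset}?"
    if asset is None:
        # The user named the action first; prompt for the missing asset.
        return f"Which fund would you like to {action}?"
    # Both fields are known; produce the confirmation prompt.
    return f"Would you like to {action} {asset}?"
```

The same field can therefore be prompted for at whichever point in the interaction it turns out to be missing, which is the flexibility the paragraph above attributes to separating the data model from the gestures.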
[0035] The overlay of interaction over data models especially emphasizes the novelty of our approach and of the new
paradigm and programming model that we disclose herein.

(ii) Lack Of Explicit Environment For Encapsulating Application State 

[0036] A further consequence of separating out the data model from the user interaction in CML is that applications
authored as a CML document clearly present an environment that binds application state, e.g., PersonTitle or action
in the examples cited above. In the case of the prior art, this application state is implicit and not readily available to
other parts of the user interface encoded in the language.

[0037] By defining the data model and hence the application state explicitly, CML clearly defines the XML encoding
that will be sent back to the server once user interaction is completed. Thus, in the case of field PersonTitle, the server
would receive the following upon submission:

<PersonTitle>MR</PersonTitle>

[0038] The server, which has access to the definition of the data model, can validate the submitted value. In more
complex examples, the data model definition can encapsulate application-specific validation constraints; these
constraints can be checked both at the client-side, and later verified upon submission on the server end. This separation
of the data model and constraints from the user interface enables CML applications that allow the user to commence
an interaction using a particular interaction device, e.g., a desktop PC, submit a partially completed transaction, and
later complete the transaction using a different device, e.g., a cellular telephone.
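
Server-side validation of a submitted field against the data model might look like the following Python sketch. The `DATA_MODEL` dictionary and the `validate` function are hypothetical stand-ins for whatever representation of the enumeration the server actually holds.

```python
import xml.etree.ElementTree as ET

# Assumed server-side copy of the data model: field name -> allowed values.
DATA_MODEL = {"PersonTitle": {"MR", "MRS", "MISS"}}

def validate(submission_xml):
    """Check one submitted field, e.g. <PersonTitle>MR</PersonTitle>."""
    field = ET.fromstring(submission_xml)
    allowed = DATA_MODEL.get(field.tag)
    # Unknown fields and values outside the enumeration are both rejected.
    return allowed is not None and field.text in allowed
```

Since the check depends only on the data model, the same function could serve a submission that was started on one device and completed on another.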


(iii) The prior art reflects GUI Legacy

[0039] Many of the core attributes defined in the prior art specification reflect GUI-specific legacy. For instance, all
data types are qualified by core attributes such as shown, which make sense only for display-based interfaces. There
appears to be no unambiguous interpretation of settings such as enable=false, shown=true for non-visual devices such
as speech-based handhelds and cellular telephones.

[0040] Moreover, these attributes make it hard to map representations of user interaction to small-sized displays;
this is because an application authored in these MLs for a desktop GUI is likely to declare that many of the interaction







EP1 100 013 A2 

elements be shown, something that becomes difficult in environments where display real estate is scarce.
[0041] The prior art usually has other GUI components that have no meaning outside large screens. Unfortunately,
features that are pervasive in the language and not easily usable across modalities/channels are problematic: one
cannot guarantee that transcoding/rendering will be possible for any target.
[0042] In addition, modalities like speech may require additional information in order to render the dialog components
(e.g., grammar, vocabulary, language model, acoustic model, NL parsing and tagging data files, etc.). This information
is not available in the prior art widgets. Again, the overlay between data model and interaction leads to problems when
the same dialog component needs to be used multiple times in the page with different data files.

(iv) Lack Of Atomic Conversational Gestures

[0043] Because prior art representations of user interaction are overlaid directly on the underlying data model that
is being populated, there is no notion of a set of atomic conversational gestures in these MLs as in CML; rather, explicit
CML gestures such as select are implicit in the prior art design. For example, CML gesture select would appear in prior
art as a result of overlaying the markup for a choice element on the markup for a list structure; see the example of field
PersonTitle cited above.

[0044] Lack of atomic conversational gestures first becomes a problem when constructing more complex dialogues;
for instance, the prior art introduces explicit table and tree constructs to parallel the GUI notion of two-dimensional
tabular layout and tree widgets. But since these higher-level constructs are not built up of atomic building blocks as in
CML, mapping constructs like table or tree (where tree is declared to be open or closed) to modalities like
speech that lack a static two-dimensional display is impossible. Also, gestures like tree and table have no immediate
equivalent on small-screen devices.

(v) Synchronization 

[0045] Tight synchronization across multiple interaction modalities is a key requirement of high-quality multi-modal
interfaces. Going forward, such multi-modal clients are more likely to be constructed using the DOM (Document Object
Model as described at http://www.w3c.org) provided by conventional browsers as the underlying platform. In this latter
implementation scenario, the overlaying of the user interface constructs on the data definition detailed above is likely
to once again become a stumbling block (e.g., same problem, now view by view, as mentioned above for the lack of
explicit environment to encapsulate the dialog/application state).

[0046] Tight synchronization across modalities is a basic design goal in CML. This is reflected throughout the CML
design, and the resulting separation between conversational gestures and the definition of the data model makes it
easier to implement a multi-modal browser that is constructed on top of the DOM using the classic Model View Controller
(MVC) design.

(vi) Conversational Applications 

[0047] Conversational applications can be developed declaratively by simultaneously activating multiple forms (each
describing a transaction or portion of a transaction). This requires the capability to re-use the same dialog component
at different places in the file. As explained above, the overlay mentioned earlier does not support this requirement.

(vii) Lack of event binding 

[0048] The lack of event binding capability limits the multi-channel/multi-modal capabilities of the application: there
is no way to associate a specific logical action with a specific physical action. This is especially critical if we want
to offer multi-modal/multi-channel access where different bindings are desirable (e.g., a key shortcut for telephony
help, a voice command for help and a key combination on the keyboard for help).
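
To make the notion of event binding concrete, the following sketch shows one logical action ("help") bound to a different physical action in each channel. It is written in Python purely for illustration; the table `BINDINGS`, the event strings, and the function name `logical_action` are hypothetical and do not come from this specification.

```python
# Hypothetical binding table: (modality, physical event) -> logical action.
# The concrete keys and utterances are placeholders chosen for the example.
BINDINGS = {
    ("telephony", "key:0"): "help",
    ("speech", "say:help"): "help",
    ("keyboard", "key:ctrl+h"): "help",
}


def logical_action(modality, physical_event):
    """Return the logical action bound to a physical event, or None if unbound."""
    return BINDINGS.get((modality, physical_event))


print(logical_action("speech", "say:help"))
```

The application logic only ever sees the logical action, so a binding can be changed per device without touching the dialog itself.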

(viii) Peer

[0049] Further, prior art attempts also rely on the technique of peers for generating different user interfaces from the
same underlying representation; by doing so, they do not address the problems of synchronized multi-modal interaction.
[0050] These and other objects, features and advantages of the present invention will become apparent from the
following detailed description of illustrative embodiments thereof, which is to be read in connection with the
accompanying drawings.
Brief Description of the Drawings
[0051] 

FIG. 1 is a diagram illustrating the conventional application programming approach;

FIG. 2 is a diagram illustrating the interaction based application programming approach of the present invention;

FIG. 3 is a diagram illustrating an example of a CML authored application according to an embodiment of the
present invention;

FIG. 4 is a diagram illustrating the XFORMS concept; 

FIGs. 5A and 5B are diagrams illustrating the use of XFORMS in the interaction based programming approach of 
the present invention; 

FIGs. 6A through 6C are diagrams illustrating a GUI welcome page, transformed from a CML source code page, as
viewed with an HTML browser;

FIG. 7 is a diagram illustrating a GUI welcome page, transformed from a CML source code page, as viewed with a
WML browser;

FIG. 8 is a diagram illustrating a GUI welcome page, transformed from an HTML cosmetized CML source code 
page, as viewed with an HTML browser; 

FIG. 9 is a diagram illustrating a new interpretation of the MVC model; 

FIGs. 10-12 illustrate the migration road map from existing systems to full use of CML according to the present
invention;

FIG. 13 is a diagram illustrating a multimodal browser architecture according to an embodiment of the present 
invention; 

FIG. 14 is a flow diagram illustrating an exemplary usage of CML in the application programming process according
to an embodiment of a multimodal browser mechanism of the present invention; 

FIG. 15 is another flow diagram illustrating an exemplary usage of CML in the application programming process
according to an embodiment of a multimodal browser mechanism of the present invention; and

FIG. 16 is a diagram illustrating a multidevice browser architecture according to an embodiment of the present
invention.

Detailed Description of Preferred Embodiments

[0052] The following description will illustrate the invention using a preferred specification of CML, a preferred
multi-modal browsing environment, and some exemplary applications for a better understanding of the invention. It should
be understood, however, that the invention is not limited to these particular preferred implementations and exemplary
applications. The invention is instead more generally applicable to any information access application regardless of
the access protocol, modality, browser or device. Thus, the invention is more generally applicable to any information
access situation in which it is desirable to provide synchronized, multi-modal, easy and convenient access to information
for a user.

[0053] The detailed description is divided into the following sections for ease of reference: (I) CML Specification; and 
(II) Multimodal Browser Architecture to support, parse and render CML. Section I provides a detailed description of a 
preferred specification of CML according to the invention. Section II provides a detailed description of a preferred 
multimodal browsing environment implementing CML according to the invention. 

I. CML SPECIFICATION

[0054] The following description is a specification of a preferred embodiment of CML. This section is divided into the 
following subsections for ease of reference: (A) Introduction; (B) Comparative Examples; (C) CML Syntax; (D) Name- 
spaces; (E) CML Attributes; (F) CML Components; (G) Binding Events; (H) Grouping Gestures and Defining Focus; 
(I) Data Model and Data Types; (J) Accessing Environment; (K) CML Traversal Model; (L) Transforming CML to Specific
User Interface Languages; (M) Cosmetization; and (N) CML Document Type Definition. 

A. Introduction 

[0055] As mentioned above, separating content from presentation in order to achieve content re-use is the
conventionally accepted way of deploying information on the World Wide Web (WWW). This is illustrated in FIG. 1. As shown,
the existing approach with respect to application authoring is to consider only two components: a content component
(A) and a presentation component (B). In the current W3C architecture, such separation is achieved by representing
content in XML that is then transformed to appropriate final-form presentations (e.g., HTML, VoiceXML, WML) via
application and device dependent XSL transforms. However, critical disadvantages exist with this approach. Indeed,
the XSL rules typically depend on the backend application or domain. As a result, authoring of an application is a
multiple authoring exercise with design of the XML content and then design of an XSL style sheet per application/page
and per target device/channel. In addition, when style sheets are expected to be used to transcode from one ML to
another, as previously mentioned, transcoding is typically between two legacy languages (e.g., HTML to WML),
and the original content has been built in HTML while following very strict rules of authoring. Indeed, this is enforceable
only within a given company, for a given web site. Even in those cases, it is hardly implementable, in general, because
of missing information across markup languages or modalities in order to provide the corresponding components in
other modalities (e.g., an HTML form or menu does not provide the information required to render it automatically by
voice).

[0056] CML is motivated by the realization that in addition to form (presentation) and content, there is a third
component, i.e., interaction, that lies at the heart of turning static information presentations into interactive information. It
is to be appreciated that static information is a very particular case where the user is passive and presented with all
the information. This new paradigm is illustrated in FIG. 2. As shown, the present invention introduces the concept of
programming by interaction wherein application authoring is broken into three components: content (A), presentation
(B) and interaction (C). This new programming paradigm goes hand in hand with the development of a new programming
environment, e.g., development tools, etc.

[0057] We refer to such "light-weight" information applications, or electronic information with small amounts of
embedded application intelligence, as "infoware" throughout this specification. Until now, such interaction has been
represented partly within the presentational HTML, e.g., form elements, and partly within server-side logic encapsulated in
servlets and CGI (Common Gateway Interface) scripts. This combination has resulted in the creation of infoware or
light-weight applications where the information content dominates. Good examples of infoware on today's WWW include
e-businesses like Amazon.com.
[0058] As we move to a world where we interact with such infoware via multiple modalities, it is now time to achieve
a clear separation between these three aspects of electronic content, namely, content, presentation, and interaction.
[0059] CML is based on the insight that all man-machine dialog can be broken down to an appropriate sequence of
"conversational gestures" or modality-independent building blocks (components or elements) that can be appropriately
combined to replace any interaction. CML encapsulates man-machine interaction in a modality-independent manner
by encoding these basic building blocks in XML. Such CML encapsulations are later transformed to appropriate
modality-specific user interfaces. This transformation is performed in a manner that achieves synchronization across
multiple "controllers," i.e., browsers in today's WWW-centric world, as they manipulate modality-specific "views" of a
single modality-independent "model." The terms "model," "view" and "controller" are well-known terms used in
accordance with the classic MVC (model-view-controller) decomposition of computing, see, e.g., G.E. Krasner and S.T. Pope,
"A Cookbook for Using the Model-View-Controller User Interface Paradigm in SmallTalk-80," Journal of Object-Oriented
Programming, 1(3):26-49, August/September 1988, the disclosure of which is incorporated herein by reference. The
result is uniform conversational behavior across a multiplicity of information appliances and coordinated,
well-synchronized user interaction across a multiplicity of interface modalities.

B. Comparative Examples 

[0060] Before providing a description of the specification of the CML preferred embodiment, we present some
examples to illustrate fundamental principles of CML and programming by interaction. The examples refer to a "global
cafe" site. Imagine a cafe that has decided to offer its customers the possibility to pre-order their drinks prior to
arriving at the cafe or when in the cafe. As such, they fundamentally want to offer access to their information independently
of the access channel.

[0061] Accordingly, a page is authored in CML. The CML code for generating this page is illustrated in FIG. 3 as
CML code 10. The page fundamentally comprises a sequence of conversational gestures (note that the gestures here
are taking some freedom from details of the actual CML specification, to be provided below, for the sake of providing
a better understanding of the fundamental principles of CML and programming by interaction). The page may comprise
the following:

(1) Title (shown as "gesture" 20 in FIG. 3): "Global Cafe" (i.e., a particular message to be rendered as a title).

(2) A gesture message (shown as "gesture" 22 in FIG. 3): Would you like coffee, tea, milk or nothing?

(3) A gesture exclusive select out of a list (shown as "gesture" 24 in FIG. 3): the list is composed of the following
items: coffee, tea, milk and nothing.

(4) A submit gesture (not expressly shown in FIG. 3).

Clearly, the page fully defines the complete interaction with the user without introducing any dependency on the target
modality (i.e., type of access channel or access device). It also clearly illustrates the programming model of
programming by interaction:

(i) The application is programmed by interaction: using elementary components of interaction, independently of
the target modality:

(a) A gesture message: "Global Cafe."

(b) A gesture message: Would you like coffee, tea, milk or nothing?

(c) A gesture exclusive select out of a list.

(d) A submit gesture.

(ii) This is connected to the backend, which is programmed/developed conventionally. In this example, the
connection to the backend is illustrated by the list (coffee, tea, milk and nothing) that has been read in the backend
database and added as argument to the list, either statically, when the page has been produced, or dynamically,
when the pages have been dynamically generated on the server using backend logic.

(iii) At this stage, if needed, constraints and validations of the attributes/variables can be added, for example using
the XFORM syntax. For example, if the page asks for the age of the user in order to offer alcoholic beverages, a
constraint can easily be expressed that restricts or modifies the dialog if the user indicates that he is under age.
This is not explicitly shown on the page.

(iv) The presentation can thereafter be cosmetized. In the present example, it is done by using the gesture title
instead of a gesture message: modality independent cosmetization. Modality specific cosmetization can also be
added, for example by adding HTML tags that specify the background (color or image) to use for the resulting
HTML page. This will be ignored by the other target modalities or replaced by a "behavior" provided for the other
modalities. For example, when an image is displayed in the HTML modality, a caption may be provided to be
rendered instead for the WML, VoiceXML, or other modalities.

(v) The resulting pages can now be rendered by appropriate browsers. Two models exist. Either CML pages are
served to browsers that can parse and render CML content (see Case B below) or they are served to legacy
browsers that can only handle legacy languages, e.g., HTML, WML, VoiceXML, etc. (see Case A below).

(a) Case A: This case is also known as the "multi-channel" case. The target browser is well-defined (identified
at the HTTP connection for HTML browser), because of the address of the requester (wireless gateway or
speech browser) or because of the request (i.e., HTML file request versus WML page request). When a page
is requested, it is fetched in CML and transcoded on the fly using the gesture-based XSL transformation rules
into the target ML.

(b) Case B: The target browser handles CML. Therefore, it knows exactly which modalities it supports
(single or multiple) as well as the rules required to optimally render a given gesture in its supported
modalities. These gesture XSL transformation rules are advantageously something that has been programmed
in the browser when the device was built or when the browser was ported to it. Clearly, it means that the most
appropriate programmer with the appropriate information (i.e., knowing fully well the device) takes that
responsibility.

(vi) In both cases, the CML application developer does not need to do anything. He/she can always assume that
the platform/browser will appropriately handle the rendering.

(vii) The gestures are completely independent of the target modality. The associated rules also depend only on the
gesture, not on the backend business logic/domain or anything else. This is why the XSL rules can be stored on the browser.

(viii) The XSL rules render the gestures based on the target modality. In the present case this means:

(a) Title:

- HTML: Bold header characters displayed
- WML: Single card display
- VoiceXML: Welcoming prompt

(b) Message:

- HTML: Display in regular characters
- WML: Display in regular characters (possibly on multiple cards)
- VoiceXML: Generate a prompt message (text-to-speech or playback)

(c) Exclusive selection out of list:

- HTML: Pull-down menu
- WML: Radio buttons
- VoiceXML: Dialog (possibly natural language) to select in the menu (e.g., "You have that many items to
select from. I will read the first three. Please select an item or say more for the next 3...").

Returning to FIG. 3, a visualization is shown of the three example renderings that may be obtained in the global cafe
application written in CML. Thus, from the CML code 10 comprising the gesture-based XSL transformations, an
HTML rendering 12, a WML rendering 14, and a VoiceXML rendering 16 of the global cafe application are obtained.

(ix) When the transcoding is performed by a multi-modal/conversational browser (as described below), the gestures
are uniquely identified using a node_id tag. This allows not only production of the rendering in each registered modality
(local or distributed), but also very tight synchronization (i.e., on a gesture level or even sub-gesture
level, when it is a gesture for which this makes sense). For example, an event (I/O event) immediately impacts
the state of the dialogs (i.e., the state as maintained in the multi-modal shell, for example, as in the
above-referenced patent application identified by attorney docket no. Y0999-178) and the other modalities. Thus, such tight
synchronization may exist between the HTML rendering 12 as may be supported by a personal digital assistant
and the VoiceXML rendering 16 as may be supported by a conventional telephone.
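
The per-modality rendering of a single gesture can be sketched as follows. This is a hedged illustration only: plain Python functions stand in for the gesture-based XSL transformation rules, and the function name `render_select` and the exact markup fragments it emits are assumptions made for the example rather than output defined by this specification.

```python
from html import escape


def render_select(prompt, choices, modality):
    """Render an exclusive-select gesture for one target modality."""
    if modality == "html":
        # Pull-down menu.
        options = "".join(f"<option>{escape(c)}</option>" for c in choices)
        return f"<p>{escape(prompt)}</p><select>{options}</select>"
    if modality == "wml":
        # WML select list, commonly presented as radio-button style options.
        options = "".join(
            f'<option value="{escape(c)}">{escape(c)}</option>' for c in choices
        )
        return f"<p>{escape(prompt)}<select>{options}</select></p>"
    if modality == "voicexml":
        # Spoken menu: the prompt is read out and each choice is an accepted answer.
        items = "".join(f"<choice>{escape(c)}</choice>" for c in choices)
        return f"<menu><prompt>{escape(prompt)}</prompt>{items}</menu>"
    raise ValueError(f"unsupported modality: {modality}")


drinks = ["coffee", "tea", "milk", "nothing"]
for target in ("html", "wml", "voicexml"):
    print(render_select("Would you like coffee, tea, milk or nothing?", drinks, target))
```

The gesture itself (prompt plus list of choices) is written once; only the rule selected by the target modality differs, which is the property that lets such rules live in the browser rather than in the application.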

[0062] Note that the gesture XSL transformation rules can be overwritten by the application developer by indicating
where they should be downloaded from. They can also be overwritten by user, application or device preference from what
would otherwise be the default behavior.

[0063] New gestures can also be added, in which case the associated XSL rules must be provided (e.g., a URL
where to get them).

C. CML Syntax 

[0064] In a preferred embodiment of CML, CML syntax is XML compliant. CML instances are well-formed XML. CML
processors may be implemented as validating XML processors based on device constraints.

(i) Special CML notes 

(1) Case Sensitivity 

[0065] CML clients and servers treat CML element and attribute names as being case sensitive. As a convention, 
all element and attribute names defined in this specification use lower-case. This convention is strictly imposed on all 
predefined element and attribute names. 

(2) Content Model 

[0066] A CML instance consists of a sequence of XML elements. CML does not allow any PCDATA at top level, i.e.,
all top-level children of a CML instance are necessarily elements.

(3) Sparse CMLs 

[0067] CML instances may be sparse; except for attribute node_id, none of the top-level CML attributes and elements
documented in this specification are required.

(4) Entity References 

[0068] All entity references in CML conform to the URI (Universal Resource Identifier) specification; see the URI
specification from the W3C at http://www.w3.org.

(ii) Terminology 

[0069] The terminology used to describe CML documents is defined in the body of this specification. The terms
defined in the following list are used in building those definitions and describing the actions of a CML "processor." A
CML processor generally refers to a processing device configured to execute CML code and associated applications.
The terms are:

may - Conforming CML documents and processors are permitted to, but need not, behave as described.

must - Conforming CML documents and processors are required to behave as described; otherwise they are in
error, as defined below.

error - A violation of the rules of this specification; results are undefined. Conforming software may detect and
report an error and may recover from it.

fatal error - An error which a conforming CML processor must detect and report to the application.

D. Namespaces

[0070] This section details the use of namespaces within all sections of a CML instance. Note that all elements and
attributes defined in this specification are implicitly in namespace cml, i.e., element name message in a CML instance
occurring within a general XML document is visible to the processing application as cml:message; CML attribute
node_id would be visible to the XML processor as cml:node_id. The subsequent paragraphs in this section define the
rules for how namespace cml is further subdivided to avoid name collisions amongst CML clients.

[0071] All namespaces introduced by "unqualified" namespaces, e.g., vxml, are implicitly in namespace
com.ibm.cml.vxml. More generally, vendor-specific namespaces use a vendor prefix that is constructed from the vendor's domain
name; this is analogous to the scheme used by systems like Java.
[0072] CML also uses namespaces to allow field names and values from different pieces of infoware to coexist. Thus,
the fully qualified name of field drink in application cafe is cafe.drink. An example application will be given below for
this drink example. Note that all field names in CML are always fully qualified, i.e., there is no implicit hierarchy within
field names based on the nesting level at which an associated gesture occurs.
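
The two naming rules above can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions: the helper names `vendor_namespace` and `qualified_field` are hypothetical, and the general construction (reversed domain name, then `cml`, then the unqualified name) is inferred from the single com.ibm.cml.vxml example rather than spelled out by the specification.

```python
def vendor_namespace(domain, name):
    """Build a vendor-qualified namespace from a domain name, Java-package style."""
    reversed_domain = ".".join(reversed(domain.split(".")))
    return f"{reversed_domain}.cml.{name}"


def qualified_field(application, field):
    """Return the fully qualified field name; nesting level plays no role."""
    return f"{application}.{field}"


# Two pieces of infoware can hold a field called "drink" without colliding.
environment = {
    qualified_field("cafe", "drink"): "coffee",
    qualified_field("bar", "drink"): "water",  # "bar" is a made-up second application
}

print(vendor_namespace("ibm.com", "vxml"))
print(sorted(environment))
```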

E. CML Attributes

[0073] CML instances can have the following XML attributes. Unless stated otherwise, all attributes are optional. 

(i) node_id - Unique identifier for this CML node. Attribute node_id is required.

(ii) title - Human-readable metadata string specifying a title for the CML instance.

(iii) name - Name used to establish a namespace for all field values instantiated within the CML instance. This
attribute is required for CML instances that are intended to be reusable.

(iv) action - Specifies the URL (Uniform Resource Locator) that is the target action of the CML instance.

(v) style - URI of associated XSL style sheet. Unless specified, the CML interpreter defaults to a generic style sheet
for transforming the modality-independent CML instance into modality-specific encodings. Attribute style allows
CML creators to override or specialize system-wide style rules.

F. CML Components 

[0074] A CML instance represents a "conversational gesture." As previously mentioned, a conversational gesture is
a basic building block of a dialog and encapsulates the interaction logic in a modality independent manner. Complex
conversational components (also referred to as dialog components or dialog modules) are constructed by aggregating
more basic conversational gestures described in detail in subsequent subsections. These complex conversational
components are usually task oriented, e.g., get a phone number, get an address, etc. CML descriptions of basic
conversational gestures can nest to the desired level of complexity. Besides nesting, complex conversational components
can be obtained by combining the basic conversational gestures in parallel and/or in sequence. Also, complex
conversational components can be achieved by combining imperative gestures, e.g., Conversational Foundation Classes
(CFCs), as will be explained below. Note also that though every CML gesture is an XML element, the converse is not
true, i.e., not every XML element defined in this specification is a CML gesture. Many CML gestures use sub-elements
to encapsulate substructure of a given gesture. In the subsequent sections, CML elements that are "gestures" are
marked as such in the subsection entitled Gesture Message.

[0075] CML is designed to inter-operate with other emerging W3C standards such as, for example, XHTML
(Extensible HyperText Markup Language). CML elements therefore re-use, rather than reinvent, elements from other markup
languages like HTML, MathML, etc., where appropriate. Such elements, when embedded in a CML instance, are fully
qualified, e.g., html:em. The first subsection below introduces the common aspects of the various CML building blocks;
subsequent subsections describe each building block in detail. Notice that each CML primitive captures a basic
conversational gesture; XML attributes are used to encode more specialized behaviors. Thus, for example, asking a yes
or no question is a CML primitive; a yes or no question requiring user confirmation is a refinement of this primitive.
[0076] It is to be appreciated that because CML allows overwriting gestures and extending gestures, it does not
matter what basic set of CML gestures is provided in a particular embodiment of CML. The set and rules
provided herein allow for implementation of any legacy page and interaction.
[0077] CML gestures share the following common XML attributes:

action - Action to be performed upon completion of the gesture. Attribute action can be one of link, return or submit.

(i) Gesture Message

[0078] The conversational gesture message is used to convey informational messages to the user. The gesture
message is typically rendered as a displayed string or a spoken prompt. Portions of the message to be spoken can be
a function of the current state of the various pieces of infoware being hosted by the CML interpreter (see section on
accessing environment state).

Example: 

[0079] 



<message node_id="1">
  Your <html:em>checking</html:em> account balance is
  <value name="banking.checking.balance"/>
  after transferring
  <value name="banking.checking.transfer"/>
  to your
  <value name="banking.creditCard.account"/>
</message>



Empty element value is used to splice in variable information from the current environment and is defined formally in 
the section on accessing environment state. 
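
How a CML interpreter might splice environment values into a message can be sketched as below. This is a minimal illustration in Python using the standard library XML parser, not the interpreter described in this specification: the function name `render_message`, the sample environment values, and the `xmlns:html` declaration (added only so that a generic XML parser accepts the `html:em` prefix) are assumptions.

```python
import xml.etree.ElementTree as ET

CML_MESSAGE = """
<message xmlns:html="http://www.w3.org/1999/xhtml" node_id="1">
  Your <html:em>checking</html:em> account balance is
  <value name="banking.checking.balance"/>
  after transferring
  <value name="banking.checking.transfer"/>
</message>
"""


def render_message(xml_text, environment):
    """Flatten a message gesture to text, replacing each value element."""
    parts = []

    def walk(node):
        if node.tag == "value":
            # Empty element: look the named field up in the environment.
            parts.append(str(environment[node.get("name")]))
        elif node.text:
            parts.append(node.text)
        for child in node:
            walk(child)
            if child.tail:
                parts.append(child.tail)

    walk(ET.fromstring(xml_text))
    # Collapse the layout whitespace of the source document.
    return " ".join(" ".join(parts).split())


env = {"banking.checking.balance": "$1,200", "banking.checking.transfer": "$300"}
print(render_message(CML_MESSAGE, env))
```

A speech rendering would pass the resulting string to a prompt, while a visual rendering could keep the emphasis markup instead of flattening it.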

(ii) Gesture Help 

[0080] The conversational gesture help is used to encapsulate contextual help to be displayed if the dialog runs into
trouble. The gesture help is typically rendered as a displayed string or a spoken prompt. Portions of the message can
be a function of the current state of the various pieces of infoware being hosted by the CML interpreter.

Example: 

[0081] 

<help node_id="1">
  You can check your account balances by specifying a
  particular account.
</help>



(iii) Final 

[0082] CML element final is used within gestures to encapsulate actions to be taken upon successful completion of 
the encapsulated gesture, e.g., updating the enclosing environment based on user interaction. 

(iv) Gesture Boolean: Yes Or No Questions 

[0083] The conversational gesture boolean encapsulates typical yes or no questions. The gesture boolean
encapsulates the prompt to be used as a message, as well as the default response, if any. Attributes require_confirmation,
require_confirmation_if_no and require_confirmation_if_yes (all false by default) allow infoware applications to refine
the dialog.






Example: 
[0084] 

<boolean default="y"
         node_id="1"
         require_confirm_if_no="true">
  <grammar type="text/jsgf">
    (yes | yeah) {yes} | (no | nay) {no}
  </grammar>
  <message>
    Please confirm that you would like to stay at the
    <value href="travelCenter.hotel.selected"/>
  </message>
</boolean>

(v) Gesture Select 

[0085] The conversational gesture select is used to encapsulate dialogues where the user is expected to pick from
a set of choices. It encapsulates the prompt, the default selection, as well as the set of legal choices. Attributes of
element select refine the gesture to achieve mutually exclusive select (visually rendered as a group of radio buttons),
select from range (visually rendered as a scrollbar), etc. Sub-elements of select include:

choices - Contains the list of possible choices, embedded either by value or by reference. Element choices contains a list
of one or more choice elements as well as, at most, one default element that specifies the default selection, if any.

predicate - Predicate encapsulating the test that the selection should satisfy.

help - Help to be offered in case the dialog gets stuck.

error - Contains a message to be used if the predicate fails.
Example:
[0086]

<select name="portfolio.fund"
        node_id="1"
        require_predicate="yes">
  <message node_id="2">
    Which of your positions would you like to check?
  </message>
  <help>
    You can specify the names of stocks or funds you own
    and we will report your current position.
  </help>
  <choices>
    <var name="possibleChoices"/>
    <default value="possibleChoices">
      Check the position of all holdings</default>
  </choices>
  <predicate>
    <condition>
      fund in possibleChoices
    </condition>
    <error>
      Sorry, you do not appear to own any shares in
      <var name="portfolio.fund"/>
    </error>
  </predicate>
</select>



(vi) Predicate

[0087] The element predicate is used in CML to encapsulate rules for validating the results of a particular
conversational gesture. Test predicates are expressed as simple conditionals using the expression syntax and semantics
defined in the XPath specification from the W3C, i.e., XML Path Language, W3C Proposed Recommendation, the
disclosure of which is incorporated by reference herein, see http://www.w3.org/tr/xpath. XPath specifies an expression
syntax for accessing different portions of the document tree; validations that require calls to an application backend
are handled separately.

[0088] Conversational gestures that include a predicate element qualify the action to be taken in case of a failed test 
via appropriate attributes. 
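
The relationship between a predicate, its error message and the failure action can be sketched as follows. This is only an illustrative sketch: it uses a Python callable in place of an XPath expression, and the names Predicate, validate and owns_fund are hypothetical, not part of CML.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Predicate:
    """Hypothetical stand-in for a CML <predicate>: a test plus an error message."""
    condition: Callable[[Dict[str, object]], bool]
    error: str


def validate(env: Dict[str, object], predicate: Predicate,
             on_fail: str = "retry") -> Optional[str]:
    """Return None when the test passes, otherwise the action named by on_fail."""
    if predicate.condition(env):
        return None
    # Report the failure using values taken from the environment.
    print(predicate.error.format(**env))
    return on_fail


# Mirrors the select example: the chosen fund must be one of the user's holdings.
owns_fund = Predicate(
    condition=lambda env: env["fund"] in env["possibleChoices"],
    error="Sorry, you do not appear to own any shares in {fund}",
)
```

Here the on_fail argument plays the role of the attribute that qualifies what happens after a failed test, as attribute on_fail does in the user identification example.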

(vii) Grammar

[0089] The CML sub-element grammar is modeled after element grammar in VoiceXML. Sub-element grammar encodes
the grammar fragment; sub-element help encapsulates an appropriate help message to be played to the user
to indicate what utterances are allowed. Where appropriate, CML gestures can provide grammar fragments that are
assembled into more complex grammars by the CML interpreter.

[0090] The sub-element grammar can be generalized as rules to process input, in particular, speech. These rules
can be strict or can describe remote resources to be used for processing (URL), and provide arguments to pass to
these resources that characterize what processing must be performed with what data file, how the result must be
returned, and to what address. In general, the grammar may be defined inline or defined via a URL.
[0091] In addition, it is also possible to declare this processing through an object tag, e.g., <object> ... </object>.
An object tag allows for loading Conversational Foundation Classes (CFCs) or Conversational Application Platform
(CAP) services (see, e.g., the PCT international patent application identified as US99/22927 (attorney docket no.
Y0999-111), filed on October 1, 1999, wherein CAP is equivalent to CVM or Conversational Virtual Machine). Arguments
can be passed to the object using XML attributes and variables. Results can be returned via similar variable
placeholders. This allows these object calls to access and modify the environment.

[0092] Objects can be qualified by the attribute execute, which can take the values: parallel (executed in parallel,
non-blocking, and able to notify effects on the environment on the fly during execution, prior to completion),
asynchronous (executed asynchronously, non-blocking, notifying via an event when completed to update the environment),
or blocking (the browser waits for completion of the object call before updating the environment and continuing).
[0093] All the information needed to distribute the processing is described in the above-referenced PCT international
patent application identified as US99/22925 (attorney docket no. Y0999-113), filed on October 1, 1999, which defines
an architecture and protocols that allow distribution of the conversational applications. As such, the international patent
application describes how such distribution can be done and how it allows, in the current case, the processing to be
distributed between a client browser and a server browser, as well as between local engines and server engines. This
allows distribution of the processing of the input/output event across the network.

(viii) Gesture Menu

[0094] The gesture menu is a special case of gesture select. Gesture menu is used for encapsulating dialogues that 
help the user navigate through different subparts of an application. The same effect can be achieved using gesture 
select, however, having an explicit menu gesture enables authors to provide more semantic information about the 
reason why the select gesture is being used. Notice that in the example below, element menu is equivalent to element 
select with attribute action set to link. 

Example: 

[0095] 

<menu name="main">
  <choice value="#query">Ask a question</choice>
  <choice value="#browse">Browse available categories</choice>
</menu>

The value of attribute value in each choice specifies the URI target for that choice.
(ix) Gesture User Identification 

[0096] The conversational gesture user_identification is used to encapsulate user login and authentication. It is
designed to be generic and is specialized for specific user interaction environments via style rules.
[0097] Sub-elements user and identify encapsulate conversational gestures for obtaining the user name and
authentication information. Element predicate provides the test for ascertaining if the user has authenticated successfully.
Example:
[0098] 

<user_identify name="login"
               require_predicate="yes"
               on_fail="retry"
               node_id="2">
  <message node_id="3">
    To use this service, you first need to login using your name
    and personal identification.
  </message>
  <user name="userid"
        node_id="4">
    What is your user id?
  </user>
  <identify name="pin"
            node_id="5">
    Please provide your user authentication.
  </identify>
  <predicate>
    <condition>
      backend.authenticate(userid, pin)
    </condition>
  </predicate>
  <error>
    Sorry, login for <var name="userid"/>
    with identification <var name="pin"/> failed.
  </error>
</user_identify>

Variations on this gesture can be useful, e.g., an explicit distinction between an identification gesture (e.g., identify who
the person is), a verification gesture (e.g., authentication of the claimant), and speech biometrics (e.g., U.S. Patent No.
5,897,616).

(x) Gesture Constrained Input 

[0099] CML provides a number of pre-defined dialog components for obtaining user input such as dates and currencies.
Typically, such input is more open-ended than the various selection gestures enumerated so far, and is realized
in conventional visual interfaces via simple edit fields. However, encapsulating the domain-specific constraints for such
input gestures is advantageous in constructing spoken interaction. Also, notice that such domain-specific constraints
are typically implemented in today's WWW interfaces as client-side scripts within HTML pages that perform validation
of user input before it is submitted to a server. In CML, we formalize those input gestures that are widely used on
today's WWW for performing standard user-level tasks. CML also provides an extension mechanism that allows this
basic set of input gestures to be extended over time. Note that all CML elements defined in this list are gestures:

(1) Date - Specify date.

(2) Time - Specify time.

(3) Currency - Specify currency amount.

(4) Credit card - Specify a credit card (including card type, card number and expiration date).

(5) Phone - Specify a telephone number.

(6) Email - Specify an e-mail address.

(7) URL - Specify a URL.

(8) Snail Address - Specify a "snail mail" address, including street, city/state/country and zip code.

[0100] The constrained input gesture can easily be extended by passing a grammar for other input fields. Note that
this gesture can, in addition, be associated with transcoding rules that can be localized (i.e., internationalized and given
regional flavors). This statement actually extends to all gestures and gesture-based transcoding rules.
Based on the location (i.e., calling number, origin of the IP address, preferences known about the user (on his local
device/browser or transmitted through cookies to the server)), gestures can be expressed in another language (i.e.,
"Select yes or no" becomes "Selectionnez oui ou non" etc.) or adapted to the geography (e.g., zip code becomes postal
code).
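
One way to picture localized transcoding is a per-locale string table consulted when a gesture is rendered. The sketch below is an assumption made for illustration only: the locale tags, the table layout and the function name localize are hypothetical, and only the French prompt and the "postal code" wording come from the text above.

```python
from typing import Dict

DEFAULT_LOCALE = "en-US"

# Hypothetical per-locale string tables for gesture prompts and field labels.
PROMPTS: Dict[str, Dict[str, str]] = {
    "en-US": {"confirm": "Select yes or no", "postcode": "zip code"},
    "en-CA": {"postcode": "postal code"},
    "fr-FR": {"confirm": "Selectionnez oui ou non"},
}


def localize(key: str, locale: str) -> str:
    """Look up a string for the locale, falling back to the default locale."""
    table = PROMPTS.get(locale, {})
    return table.get(key, PROMPTS[DEFAULT_LOCALE][key])


print(localize("confirm", "fr-FR"))
print(localize("postcode", "en-CA"))
```

The fallback to the default table means a locale only needs to list the strings that actually differ.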

(xi) Gesture Unconstrained Input 



[0101] The conversational gesture input is used to obtain user input where the input constraints are more complex
(or perhaps non-existent). The gesture encapsulates the user prompt, application-level semantics about the item of
information being requested, and possibly a predicate to test the validity of the input. Note that gesture input along
with application-specific semantic constraints provides a means to extend the set of built-in constrained input gestures
discussed in the previous section.



Example: 
[0102] 



<input node_id="1">
  <message> ... </message>
</input>



(xii) Gesture Submit

[0103] The conversational gesture submit specifies the components from the environment to be packaged up and
returned by the containing CML instance. It also encapsulates the prompt to be used as well as the target URI to which
the encapsulated environment state is to be submitted.

Example:

[0104]

<submit target="uri">
  <env name="location.state"/>
  <env name="location.city"/>
</submit>

Sub-element env specifies components of the environment to be submitted by the enclosing gesture.

[0105] It is to be appreciated that while various CML attributes and components have been described above, other
attributes and components will be presented and defined below in the course of describing further aspects of this
embodiment of CML. It should be understood that other attributes and components may be defined in accordance with
the teachings of the invention. That is, the invention is not intended to be limited to the particular attributes and
components that are described in this detailed description.

G. Binding Events 

[0106] CML provides a flexible, extensible mechanism for application authors to define "logical input events" and the
association between such logical events and the actual "physical input events" that trigger the defined logical events.
CML gestures declare logical events that they are prepared to handle via CML attribute trigger. When a defined logical
event is received, the closest enclosing gesture that has a matching event in its trigger list handles the event. The CML
attribute trigger allows a gesture to be triggered by an event that is logically bound to it. This mechanism is best
illustrated by an example. In the fragment of CML code shown below, the application defines help as a logical input
event, binds this to physical events in two separate modalities, and finally declares a CML gesture to handle the help
event.

Example: 

[0107]

<cml name="travel">
  <bind-event logical="help"
              modality="dtmf"
              physical="*"/>
  <bind-event logical="help"
              modality="qwerty"
              physical="h"/>
  <help name="help"
        trigger="help">
    Top-level application help
  </help>
</cml>

CML element bind-event takes three attributes: 

(1) logical - Specifies the name of the logical event being defined.

(2) modality - Specifies the interaction modality in which the event is being bound.

(3) physical - Specifies the physical event to bind to a logical event.

[0108] Input events that are not handled by CML gestures making up the application bubble up to the CML interpreter,
where standard platform events such as help are handled by a default handler. Bubbling up means that the search for a
gesture that matches the trigger value proceeds hierarchically from the closest enclosing gesture to each higher one,
until no gesture matches. In such a case, the trigger should be associated with a service offered by the browser, if not
by the underlying platform (e.g., conversational virtual machine of Y0999-111). If none are met, the event is ignored
or a default message is returned to the user explaining that the input was not understood (or not supported) and ignored.
These, however, are implementation choices of the browser and underlying platform, not choices of the language. Note
that mechanism bind-event is designed to override platform behavior; it is not meant to be used as the exclusive
mechanism for mapping user input to CML gestures. Thus, using element bind-event to bind all valid spoken utterances
in an application to the appropriate gestures is deprecated.

[0109] Further, note that omitting attribute modality in element bind-event results in associating the specified physical
binding in all modalities. Omitting the value of attribute physical in element bind-event declares a logical event that is
unbound, i.e., not bound to a physical event.
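
The binding and bubbling behavior described in this section can be sketched as follows. This is a simplified model under stated assumptions: the class names Gesture and EventBindings, the function dispatch, and the "platform-default" return value are hypothetical, and trigger lists are attached directly to enclosing gestures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Gesture:
    name: str
    triggers: List[str] = field(default_factory=list)
    parent: Optional["Gesture"] = None


class EventBindings:
    """Maps (modality, physical event) pairs to logical event names."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[Optional[str], str], str] = {}

    def bind(self, logical: str, physical: str, modality: Optional[str] = None) -> None:
        # modality=None models an omitted modality attribute: bound in all modalities.
        self._table[(modality, physical)] = logical

    def resolve(self, modality: str, physical: str) -> Optional[str]:
        return self._table.get((modality, physical)) or self._table.get((None, physical))


def dispatch(logical: str, focus: Gesture) -> str:
    """Bubble up from the closest enclosing gesture to the first matching trigger."""
    node: Optional[Gesture] = focus
    while node is not None:
        if logical in node.triggers:
            return node.name
        node = node.parent
    return "platform-default"  # hypothetical label for the interpreter's default handler


bindings = EventBindings()
bindings.bind("help", "*", modality="dtmf")
bindings.bind("help", "h", modality="qwerty")

travel = Gesture("travel", triggers=["help"])
hotel = Gesture("hotel", parent=travel)
print(dispatch(bindings.resolve("dtmf", "*"), hotel))
```

What happens when no gesture matches is, as noted above, an implementation choice of the browser and platform; the sketch simply returns a placeholder name.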

H. Grouping Gestures And Defining Focus 

[0110] Conversational gestures, when rendered to specific modalities to realize a specific user interface, are grouped
appropriately to allow the user to interact with related portions of the interface. To understand this assertion, consider
WWW applications that split the user interaction across several HTML pages, with related portions of the interface
appearing on the same page. Similarly, speech interfaces allow users to specify any one of several related commands
at a given time.

[0111] This form of grouping of gestures is best captured at the time the application is being authored. Such grouping 
may or may not be modality independent; CML allows application authors to encapsulate both forms of grouping. 
[0112] Conversational gestures are grouped using the CML element group. Element group is further qualified by
attributes id, modality and class. Attribute id is minimally required to group gestures. Attribute modality, if present,
declares the specified grouping to be modality specific. Attribute class can be used in a manner analogous to the HTML
class attribute to enable further selection of related elements whilst transcoding CML to languages like HTML.
[0113] By default, CML gestures enclosed in a single group element map to a user interface which enables the user
to interact with any of the contained gestures. In the case of HTML, this results in the gestures being transcoded into
a single page; in the case of VoiceXML, this results in the corresponding forms being made active in parallel.
[0114] Note that activating groups of gestures in parallel is the way to implement mixed initiative NL interfaces: each
command/query supported at a given time is characterized by a form built out of gestures (i.e., a group of gestures is
called a form). When an input/output event occurs, the dialog manager provided by the browser or underlying platform
determines which gestures in the different forms are activated, and these allow their associated
attributes (the environment variables associated with the gestures) to be qualified. When all the mandatory attributes
of a form have received a value, the action is considered as disambiguated and executed. Note that extra constraints
between the attributes can be expressed using XFORMS, as will be explained below. See also the above referenced patent
application identified by attorney docket no. Y0998-392 for discussion on parallel activation, and K.A. Papineni et al.,
"Free-flow dialog management using forms," Proc. Eurospeech, 1999, and K. Davies et al., "The conversational telephony
system for financial applications," Proc. Eurospeech, 1999.
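
The form-filling rule (an action becomes executable once all of its mandatory attributes have values) can be sketched as follows. This is an illustrative model only: the names Form and handle_input and the sample forms are hypothetical, and the input is assumed to have already been reduced to attribute/value pairs by the dialog manager.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Form:
    name: str
    mandatory: Set[str]
    values: Dict[str, str] = field(default_factory=dict)

    def is_complete(self) -> bool:
        # Complete once every mandatory attribute has received a value.
        return self.mandatory.issubset(self.values)


def handle_input(forms: List[Form], slots: Dict[str, str]) -> List[str]:
    """Offer the input to every active form; return the names of complete forms."""
    ready = []
    for form in forms:
        for attr, value in slots.items():
            if attr in form.mandatory:
                form.values[attr] = value
        if form.is_complete():
            ready.append(form.name)
    return ready


# Two forms active in parallel (hypothetical examples).
buy = Form("buy_fund", {"fund", "amount"})
check = Form("check_position", {"fund"})
print(handle_input([buy, check], {"fund": "ACME"}))
```

After the first input only the form with a single mandatory attribute is complete; the other stays active until the remaining attribute is supplied.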

[0115] Instances of the element group cannot nest unless the inner group element specifies a value for attributes 
modality or class that is different from that specified in the enclosing element. 

[0116] Efforts like XFORMS (http://www.w3.org/MarkUp/Forms/) have attempted to solve problems associated with
existing markup languages by splitting forms into three layers (presentation, logic and data), as shown in FIG. 4, in an
attempt to facilitate replacing the presentation for different kinds of browsers (however, XFORMS fails to address
different modalities), while preserving the same backend. The XFORMS data layer allows the application developer to
define the data model for the form. The developer can use built-in data types or define custom ones. XFORMS builds
the data types on top of the work being done on XML Schemas. The logic layer allows the application developer to
define dependencies between fields, for example, for running totals, or where one field requires another to be filled in.
XFORMS supports a light-weight expression syntax, building upon widespread familiarity with spreadsheets and
existing forms packages. The application developer is still able to call out to scripts when extra flexibility is needed. The
presentation layer consists of markup for forms controls and other HTML markup, where each control is bound to a
field in the data model. "Getter" and "setter" functions allow the presentation to match the user's preferences, e.g., for
dates and currencies, while retaining a canonical representation internally, thereby simplifying form processing. The
same data field can have more than one presentation control bound to it. Changing the value in any of the controls
then automatically updates all of the others.
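
The binding of several presentation controls to one canonical field can be sketched as follows. This is not the XFORMS API; the classes Field and Control and the two number formats are assumptions chosen to illustrate the getter/setter idea.

```python
from typing import Callable, List


class Field:
    """One data-model field that stores a single canonical value."""

    def __init__(self, value: float) -> None:
        self.value = value
        self._controls: List["Control"] = []

    def attach(self, control: "Control") -> None:
        self._controls.append(control)
        control.refresh(self.value)

    def set(self, value: float) -> None:
        # Store the canonical value, then push it to every bound control.
        self.value = value
        for control in self._controls:
            control.refresh(value)


class Control:
    """A presentation control with a getter (canonical to display) and a setter (display to canonical)."""

    def __init__(self, data: Field, getter: Callable[[float], str],
                 setter: Callable[[str], float]) -> None:
        self.data, self.getter, self.setter = data, getter, setter
        self.display = ""
        data.attach(self)

    def refresh(self, canonical: float) -> None:
        self.display = self.getter(canonical)

    def user_edit(self, text: str) -> None:
        self.data.set(self.setter(text))


amount = Field(10.0)
dot = Control(amount, lambda v: f"{v:.2f}", float)
comma = Control(amount, lambda v: f"{v:.2f}".replace(".", ","),
                lambda t: float(t.replace(",", ".")))
comma.user_edit("12,50")  # editing one control updates the other as well
```

Both controls show the same underlying number in the user's preferred notation, while the field keeps a single canonical float.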

[0117] As explained herein, XFORMS provides a back-end mechanism for separating out data from presentation.
CML provides a mechanism to further separate the logic and presentation part into presentation rendering (i.e.,
modality-dependent rendering with no interaction information)/interaction (plus possible modality-dependent cosmetic
inputs)/content (i.e., backend data plus logic information minus all the interaction related logic components). This
inventive concept is illustrated in FIG. 5A. As previously explained, the programming paradigm of the invention separates
presentation/modality specific rendering A, interaction B, and content and backend/application logic C. FIG. 5A also
illustrates the backend mechanism of XFORMS, as mentioned above, where data D is separated from the backend E.
FIG. 5B represents how a form based mixed initiative NLU (natural language understanding) application is written in
CML. Letters A and C denote the same items as in FIG. 5A. In the block denoted as B', the interaction/dialog information
is described in terms of CML. This part describes the interactions (mandatory and optional) that need to occur in order
to realize each of the activable transactions. To this is added an XFORMS component that captures constraint and data
models associated with the underlying attribute data structure. The engine control and cosmetization part captures
additional control parameters that are used to optimize the behavior of the conversational engines, in particular the dialog
manager and NLU engines. Note that the CML portions can be used for rendering in other modalities as described
earlier. Block F denotes an exemplary form (e.g., a mutual fund demo form) that may be employed in accordance with
block B'.

I. Data Model And Data Types

[0118] CML defines data-model or data-type primitives in the manner specified by the results of the W3C work on
XML Schema and XML forms, see http://www.w3.org.

J. Accessing Environment 

[0119] CML gestures define a collection of variables collectively called the "environment." As the CML document is
traversed, variables in the environment are bound to the values resulting from successful user interaction. The
environment can be accessed and manipulated within CML gestures via elements var, value and assign, as will be explained
below. Note that all such names are always fully qualified.

(i) var - Element var declares (and optionally initializes) a variable in the current environment. Attribute name
specifies the variable name. An initial value may be specified using the same syntax as
specified for element assign, see below.

(ii) assign - Element assign assigns a value to a variable that already exists in the environment. That is, element
assign is used to bind values in the environment. Attribute name specifies the variable to be bound. The value to
be bound may be specified either as the value of attribute expr using the same expression syntax as used by
xpath; alternatively, the value to be assigned may be specified as the contents of element assign. Element assign
is typically used to bind or update intermediate variables that are not set by direct user interaction.

(iii) value - Element value retrieves the value of a defined variable. That is, attribute name of empty element value
specifies the variable whose value is to be looked up in the environment. The value of attribute name may be a partially
or fully qualified name (see section above on Namespaces) and is interpreted in the context of the containing CML
gesture.

[0120] Note that as defined above, variables must be declared before they can be assigned. 
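
The declare-before-assign rule for the environment can be sketched as follows. The class Environment and its method names are hypothetical and merely mirror elements var, assign and value; name qualification and expression evaluation are left out.

```python
from typing import Dict, Optional


class Environment:
    """Minimal model of the CML environment: declare, assign, look up."""

    def __init__(self) -> None:
        self._vars: Dict[str, Optional[str]] = {}

    def var(self, name: str, initial: Optional[str] = None) -> None:
        # Mirrors element var: declare and optionally initialize.
        self._vars[name] = initial

    def assign(self, name: str, value: str) -> None:
        # Mirrors element assign: only variables that already exist may be bound.
        if name not in self._vars:
            raise KeyError(f"variable {name!r} must be declared before assignment")
        self._vars[name] = value

    def value(self, name: str) -> Optional[str]:
        # Mirrors element value: look up the current binding.
        return self._vars[name]


env = Environment()
env.var("portfolio.fund")
env.assign("portfolio.fund", "ACME")
print(env.value("portfolio.fund"))
```

Assigning to an undeclared name raises an error in this sketch; how a real interpreter reports that condition is not specified here.
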
K. CML Traversal Model 

[0121] Infoware authored in CML is hosted by a conversational shell that mediates amongst multiple user agents,
hereafter referred to as the CML interpreter. It is to be appreciated that the traversal model will be further discussed
and illustrated in the context of FIGs. 10 and 11. User interaction proceeds by the CML interpreter mapping CML
instances to appropriate modality-specific languages such as HTML and VoiceXML. These modality-specific
representations are handed to the appropriate user agents which render modality-specific versions of the dialog.
[0122] The transformation from CML to modality-specific representations is preferably governed by XSL transformation
rules (XSLT). Note that other transformation mechanisms can be used; XSLT is merely the method proposed for a
preferred embodiment. For example, JSP (Java Server Pages) or Java Beans can be used, as well as other techniques
which transform, based on rules, the gestures to their target rendering. An example of such an implementation is to
associate a Java bean with each gesture; the Java bean carries its own rendering in each modality (through JSP). Thus, the
invention is not limited to XSLT. In any case, these XSL rules are modality-specific. In the process of mapping the CML
instance to an appropriate modality-specific representation, the XSL rules add the necessary information needed to
realize modality-specific user interaction. As an example, when translating element select to VoiceXML, the relevant
XSL transformation rule handles the generation of the grammar that covers the valid choices for that conversational
gesture.
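
The grammar-generation step mentioned above can be sketched outside XSLT. The following Python fragment is an illustration only, not the XSL rule used by the invention: it assumes that each choice's visible text is the phrase to recognize and that the choice value is returned as a tag in JSGF-like notation. The function name select_to_grammar is hypothetical.

```python
import xml.etree.ElementTree as ET

CML = """
<select name="cnn.query.part">
  <message>Which part of today's news would you like to read?</message>
  <choices>
    <choice value="h">Headlines</choice>
    <choice value="1">first story</choice>
  </choices>
</select>
"""


def select_to_grammar(select: ET.Element) -> str:
    """Build one alternation covering every choice of a select gesture."""
    alternatives = []
    for choice in select.iter("choice"):
        phrase = (choice.text or "").strip().lower()
        # Each alternative is the spoken phrase followed by the value as a tag.
        alternatives.append(f"({phrase}) {{{choice.get('value')}}}")
    return " | ".join(alternatives)


grammar = select_to_grammar(ET.fromstring(CML))
print(grammar)
```

A choice that carries its own grammar sub-element would override the generated phrase; that case is omitted here for brevity.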

[0123] The process of transforming CML instances to modality-specific representations such as HTML may result
in a single CML node mapping to a collection of nodes in the output representation. To help synchronize across these
various representations, CML attribute node_id is applied to all output nodes resulting from a given CML node. When
a given CML instance is mapped to different representations, e.g., HTML and VoiceXML, by the appropriate
modality-specific XSL rules, the shape of the tree in the output is likely to vary amongst the various modalities. However,
attribute node_id allows us to synchronize amongst these representations by providing a conceptual backlink from each
modality-specific representation to the originating CML node. In the above-referenced U.S. provisional patent application
(attorney docket no. Y0999-178), a description is provided of how to develop a platform (the multi-modal shell) able
to support tight multi-modal applications. The mechanism operates as follows. Each modality registers with the
multi-modal shell the commands that it supports and the impact that their execution will have on the other registered
modalities. Clearly, in the current case, upon parsing the CML page and transcoding the gestures, each gesture is kept
in a data structure (i.e., the table) in the multi-modal shell. Upon an I/O event in a given modality, the node_id information
is used to find the activated gesture and, from the table (i.e., the CML document dialog tree), it is immediate to find the
effect on the activated modality as well as the other modality (i.e., update of each view or fetch of a new page on the
CML server).

[0124] As user interaction proceeds, variables defined in the environment by the current CML instance get bound to
validated values. This binding happens first in one of the registered modality-specific user agents. The registered user
agent sends an appropriate message to the conversational shell comprising the updated environment and the
node_id of the gesture that was just completed. Once the updated binding has been propagated to the CML interpreter,
it messages all registered user agents with the node_id of the gesture just completed. Registered user agents update
their presentation upon receiving this message by first querying the CML interpreter for the portion of the environment
that affects their presentation.
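
The message flow of this traversal model can be sketched as follows. The classes Interpreter and UserAgent and their method names are hypothetical, and the sketch runs in a single process for illustration, whereas real user agents would exchange messages with the shell.

```python
from typing import Dict, List


class Interpreter:
    """Hypothetical stand-in for the CML interpreter (conversational shell)."""

    def __init__(self) -> None:
        self.environment: Dict[str, str] = {}
        self.agents: List["UserAgent"] = []

    def gesture_completed(self, node_id: str, updates: Dict[str, str]) -> None:
        # Record the new bindings, then tell every registered agent which gesture finished.
        self.environment.update(updates)
        for agent in self.agents:
            agent.notify(node_id)

    def query(self, names: List[str]) -> Dict[str, str]:
        return {n: self.environment[n] for n in names if n in self.environment}


class UserAgent:
    def __init__(self, modality: str, interpreter: Interpreter, shows: List[str]) -> None:
        self.modality, self.interpreter, self.shows = modality, interpreter, shows
        self.view: Dict[str, str] = {}
        self.last_node = ""
        interpreter.agents.append(self)  # register with the shell

    def complete(self, node_id: str, updates: Dict[str, str]) -> None:
        # The binding happens here first and is then sent to the shell.
        self.interpreter.gesture_completed(node_id, updates)

    def notify(self, node_id: str) -> None:
        # Query only the portion of the environment this agent presents.
        self.view.update(self.interpreter.query(self.shows))
        self.last_node = node_id


shell = Interpreter()
html = UserAgent("html", shell, ["cnn.query.part"])
voice = UserAgent("voicexml", shell, ["cnn.query.part", "cnn.query.interest"])
voice.complete("10", {"cnn.query.part": "h"})
```

After the voice agent completes the gesture with node_id 10, the HTML agent's view reflects the same binding without having handled any input itself.
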
L. Transforming CML to Specific User Interface Languages

[0125] CML is transformed into user interface (ui) specific encodings, e.g., HTML, via transformation rules expressed
in XSL. This section begins with some background material on XSL transformations and then presents examples on
how XSL is used in the context of CML and multi-modal browsers according to the invention.

(i) XSL Transformations Background Information 

[0126] The W3C XSL transformations (xslt) specification has been released as a Proposed Recommendation: XSL
Transformations (xslt) Version 1.0, reference: W3C Proposed Recommendation 8-October-1999, edited by James
Clark, the disclosure of which is incorporated by reference herein. The above-referenced W3C Proposed
Recommendation is part of the W3C Style activity. Specifically, the xslt specification defines the syntax and semantics of xslt,
which is a language for transforming XML documents into other XML documents. xslt is designed for use as part of
XSL, which is a stylesheet language for XML. A transformation in the xslt language is expressed as a well-formed XML
document conforming to the Namespaces in the XML Recommendation, which may include both elements that are
defined by xslt and elements that are not defined by xslt. A transformation expressed in xslt describes rules for
transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates. A pattern
is matched against elements in the source tree. A template is instantiated to create part of the result tree. The result
tree is separate from the source tree. The structure of the result tree can be completely different from the structure of
the source tree. In constructing the result tree, elements from the source tree can be filtered and reordered, and arbitrary
structure can be added. A transformation expressed in xslt is called a stylesheet. The xslt specification is available in
both XML and HTML formats.

(ii) XSL Transformations Examples



[0127] The following are coding examples illustrating CML code, XSL transformation rules, and the HTML, WML and
VoiceXML code resulting from the respective transformations.

[0128] The following code illustrates a full example of a page written in CML and the different gesture-based XSL
rules that have been used to produce legacy ML pages (respectively, HTML, VoiceXML and WML). Each page is
associated with a particular rendering as illustrated by the following figures. The example is of a site that offers access
to different information services: News, Business, Sports, Travel, Weather and Show Business.

(a) CML Code 

[0129] This describes the source CML page associated with the example:
<!-- $Id: cnn.cml,v 1.19 2000/02/01 Exp $ -->
<!-- Description: CNN Mobile in cml -->

<cml name="cnn"
     node_id="1"
     title="CNN Mobile News">
  <menu name="cnn.command"
        node_id="2">
    <choices node_id="3">
      <default value="#cnn.query">Select News Stories</default>
      <choice value="#cnn.exit"
              require_confirmation="true">
        Exit </choice>
      <choice value="#cnn.applicationHelp">Help</choice>
    </choices>
  </menu>

  <cml name="cnn.applicationHelp"
       title="About CNN Mobile"
       node_id="4"
       action="return">
    <message
        node_id="5">
      This application allows you to select and view CNN news stories
    </message>
  </cml>

  <cml name="cnn.exit"
       node_id="6"
       title="Exit CNN Mobile News"
       action="submit">
    <message node_id="60">
      Thank you for using the CNN news service
    </message>
  </cml>

  <group node_id="7"
         groupId="query">
    <cml name="cnn.query"
         title="Search CNN Mobile News"
         node_id="8">
      <menu name="cnn.query.topic"
            node_id="11"
            title="Topic Selection">
        <choices node_id="12">
          <choice value="#cnn.query.news"> News </choice>
          <choice value="#cnn.query.business"> Business </choice>
          <choice value="#cnn.query.sports">
            <grammar> (sport | sports) </grammar>
            Sports
          </choice>
          <choice value="#cnn.query.travel"> Travel </choice>
          <choice value="#cnn.query.weather"> Weather </choice>
          <choice value="#cnn.query.show">
            <grammar> show [business] </grammar>
            Show Business
          </choice>
        </choices>
      </menu>
    </cml>

    <cml name="cnn.query.news"
         title="News Channel"
         node_id="13"
         action="submit">
      <select name="cnn.query.part">
        <message node_id="9">
          Which part of today's news would you like to read?</message>
        <choices
            node_id="10">
          <choice value="h"> Headlines</choice>
          <choice value="1"> first story </choice>
          <choice value="2"> second story </choice>
          <choice value="3"> third story </choice>
        </choices>
      </select>

      <select name="cnn.query.interest">
        <message node_id="14">
          Which news category would you like to read?
        </message>
        <choices node_id="15">
          <choice value="business">
            <grammar type="text/jsgf">
              business {BIZ}</grammar>
            Business
          </choice>
          <choice value="africa">
            Africa</choice>
          <choice value="world"> World </choice>
          <choice value="United states"> United states </choice>
          <choice value="europe"> Europe </choice>
          <choice value="Asia"> Asia</choice>
          <choice value="me"> Middle East</choice>
          <choice value="america"> America </choice>
        </choices>
      </select>
    </cml>

    <cml name="cnn.query.business"
         title="Business Channel"
         action="submit"
         node_id="16">
      <select name="cnn.query.part">
        <message node_id="9">
          Which part of today's news would you like to read?</message>
        <choices
            node_id="10">
          <choice value="h"> Headlines</choice>
          <choice value="1"> first story </choice>
          <choice value="2"> second story </choice>
          <choice value="3"> third story </choice>
        </choices>
      </select>

      <select name="cnn.query.interest">
        <message node_id="17">
          Which business category would you like to read?</message>
        <choices node_id="18">
          <choice value="NEWS"> news </choice>
          <choice value="IN"> indexes </choice>
          <choice value="CU"> exchange rates </choice>
          <choice value="MET"> metals </choice>
        </choices>
      </select>
    </cml>

    <cml name="cnn.query.weather"
         title="Weather Channel"
         action="submit"
         node_id="19">
      <select name="cnn.query.part">
        <message node_id="9">
          Which part of today's news would you like to read?</message>
        <choices
            node_id="10">
          <choice value="h"> Headlines</choice>
          <choice value="1"> first story </choice>
          <choice value="2"> second story </choice>
          <choice value="3"> third story </choice>
        </choices>
      </select>

      <select name="cnn.query.interest">
        <message node_id="20">
          Which region are you interested in?</message>
        <choices node_id="21">
          <choice value="us"> United states </choice>
          <choice value="europe">
            <grammar type="text/jsgf"> (euro | Europe) </grammar>
            Europe
          </choice>
          <choice value="JP"> Japan </choice>
          <choice value="AU"> Australia </choice>
          <choice value="AS"> Asia </choice>
        </choices>
      </select>
    </cml>

<cml name="cnn.query.travel"
     title="Travel Section" action="submit"
     node_id="22">
  <select name="cnn.query.part">
    <message node_id="9">
      Which part of today's news would you like to read?</message>
    <choices node_id="10">
      <choice value="h"> Headlines</choice>
      <choice value="1"> first story </choice>
      <choice value="2"> second story </choice>
      <choice value="3"> third story </choice>
    </choices>
  </select>

  <select name="cnn.query.interest">
    <message node_id="23">
      Which city do you want to visit?</message>
    <choices node_id="24">
      <choice value="AMSTERDAM">AMSTERDAM</choice>
      <choice value="COPENHAGEN">COPENHAGEN</choice>
      <choice value="HELSINKI">HELSINKI</choice>
      <choice value="HONGKONG">HONGKONG</choice>
      <choice value="LONDON">LONDON</choice>
      <choice value="OSLO">OSLO</choice>
      <choice value="PRAGUE">PRAGUE</choice>
      <choice value="SINGAPORE">SINGAPORE</choice>
      <choice value="STOCKHOLM">STOCKHOLM</choice>
      <choice value="SYDNEY">SYDNEY</choice>
    </choices>
  </select>
</cml>

<cml name="cnn.query.sports"
     action="submit"
     title="Sports Channel"
     node_id="25">
  <select name="cnn.query.part">
    <message node_id="9">
      Which part of today's news would you like to read?</message>
    <choices node_id="10">
      <choice value="h"> Headlines</choice>
      <choice value="1"> first story </choice>
      <choice value="2"> second story </choice>
      <choice value="3"> third story </choice>
    </choices>
  </select>

  <select name="cnn.query.interest">
    <message node_id="26">
      What sports are you interested in?</message>
    <choices node_id="27">
      <choice value="AS"> Asia </choice>
      <choice value="w"> world </choice>
      <choice value="eu"> europe </choice>
      <choice value="us"> united states </choice>
      <choice value="nba"> NBA </choice>
      <choice value="nhl"> nhl </choice>
      <choice value="EF"> Europoean football </choice>
    </choices>
  </select>
</cml>

<submit target="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
  <message node_id="28">
    executing <value name="cnn.command"/>
    for <value name="cnn.query.part"/>
    stories about <value name="cnn.query.interest"/>
    from topic <value name="cnn.query.topic"/>
  </message>
  <env name="cnn.command"/>
  <env name="cnn.query.topic"/>
  <env name="cnn.query.interest"/>
  <env name="cnn.query.part"/>
</submit>
</group>

<submit target="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
</submit>
</cml>



(b) Gesture XSL 

[0131] The following example illustrates the CML to HTML gesture-based XSL rules that are used to transcode,
gesture by gesture, a CML page into an HTML page. Not all of the gesture-based transcoding rules required to transcode
any possible CML page are present; the example is to be considered an illustration of the method. The XSL syntax follows
conventional XSLT rules; see, e.g., http://www.w3.org/1999/XSL/Transform.



<!-- $Id: cml2html.xsl,v 1.8 1999/11/12 20:01:11 $ -->
<!-- Description: Transform CML to HTML -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xt="http://www.jclark.com/xt"
                version="1.0"
                extension-element-prefixes="xt">
  <xsl:include href="html/cml.xsl"/>
  <xsl:include href="html/environment.xsl"/>
  <xsl:include href="html/output.xsl"/>
  <xsl:include href="html/selections.xsl"/>
  <xsl:include href="common/identity.xsl"/>
</xsl:stylesheet>

<!-- $Id: cml.xsl,v 1.13 2000/01/31 Exp $ -->
<!-- Description: Translate CML element to HTML -->
<!-- Handle case of CML element being the top-level element -->
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/cml">
    <html>
      <head>
        <META http-equiv="Content-Type" content="text/html;
          charset=iso-8859-1"/>
        <title><xsl:value-of select="@title"/></title>
      </head>
      <body>
        <h1>
          <a name="{@name}">
            <xsl:value-of select="@title"/>
          </a>
        </h1>
        <xsl:choose>
          <xsl:when test="@action='submit'">
            <form>
              <xsl:attribute name="node_id">
                <xsl:value-of select="@node_id"/>
              </xsl:attribute>
              <xsl:attribute name="action">
                <xsl:value-of select="submit/@target"/>
              </xsl:attribute>
              <xsl:apply-templates/>
              <p>
                <INPUT TYPE="SUBMIT" VALUE="{@name}"/>
              </p>
            </form>
          </xsl:when>
          <xsl:otherwise>
            <div node_id="{@node_id}"
                 name="{@name}">
              <xsl:apply-templates/>
            </div>
          </xsl:otherwise>
        </xsl:choose>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="cml[@action='submit']">
    <h2> <a name="{@name}">
      <xsl:value-of select="@title"/> </a>
    </h2>
    <form>
      <xsl:attribute name="node_id">
        <xsl:value-of select="@node_id"/>
      </xsl:attribute>
      <xsl:attribute name="action">
        <!-- for real, we should process submit node to
             cons up target uri -->
        <xsl:value-of select="../submit/@target"/>
      </xsl:attribute>
      <xsl:apply-templates/>
      <p>
        <INPUT TYPE="SUBMIT" VALUE="{@name}"/>
      </p>
    </form>
  </xsl:template>

  <xsl:template match="cml">
    <h2 node_id="{@node_id}">
      <a name="{@name}">
        <xsl:value-of select="@title"/> </a>
    </h2>
    <xsl:apply-templates/>
    <xsl:if test="@action='return'">
      <p>
        <a name="{concat('#', /cml/@name)}">
          Back
        </a>
      </p>
    </xsl:if>
  </xsl:template>

  <xsl:template match="group">
    <div groupId="{@groupId}"
         modality="{@modality}"
         class="{@class}">
      <xsl:apply-templates/>






    </div>
  </xsl:template>

  <xsl:template match="submit"/>
</xsl:stylesheet>

<!-- $Id: environment.xsl,v 1.2 2000/02/01 Exp $ -->
<!-- Description: Process CML environment constructs -->
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="final">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="var">
    <input type="hidden" name="{@name}" value="{@expr}"/>
  </xsl:template>
  <xsl:template match="assign">
    <input name="{@name}" type="hidden">
      <xsl:attribute name="value">
        <xsl:choose>
          <xsl:when test="@expr=''">
            <xsl:value-of select="./node()"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="@expr"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:attribute>
    </input>
  </xsl:template>
  <xsl:template match="value">
    <b><xsl:value-of select="@name"/></b>
  </xsl:template>
</xsl:stylesheet>
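
The `assign` rule above chooses between the `expr` attribute and the element content when it builds a hidden form field. The following Python sketch paraphrases that choice. It is an illustration only and not part of the original stylesheets: the function name `assign_to_hidden_input` and the sample inputs are hypothetical, and the sketch assumes the element content is plain text.

```python
import xml.etree.ElementTree as ET


def assign_to_hidden_input(assign):
    """Paraphrase of the `assign` rule: emit a hidden <input> for one CML <assign>."""
    expr = assign.get("expr")  # None when the attribute is absent
    if expr == "":
        # Mirrors test="@expr=''": an empty expr falls back to the element content.
        # Assumption: the content is plain text, so the first node is assign.text.
        value = assign.text or ""
    else:
        # Mirrors xsl:otherwise; an absent attribute yields an empty value.
        value = expr or ""
    field = ET.Element(
        "input",
        {"name": assign.get("name", ""), "type": "hidden", "value": value},
    )
    return ET.tostring(field, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the CNN example.
    print(assign_to_hidden_input(ET.fromstring('<assign name="a" expr="5"/>')))
    print(assign_to_hidden_input(ET.fromstring('<assign name="b" expr="">news</assign>')))
```

Note that an XPath comparison of a missing attribute with the empty string is false, so in this sketch, as in the rule, only an explicitly empty `expr` selects the element content.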

<!-- $Id: output.xsl,v 1.3 1999/11/12 20:07:23 Exp $ -->
<!-- Description: Transformation rules for CML gestures that -->
<!-- primarily output information -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="message">
    <P>
      <xsl:attribute name="node_id">
        <xsl:value-of select="@node_id"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </P>
  </xsl:template>

  <!-- eventually generate pop-up help via javascript -->
  <xsl:template match="help">
    <P>
      <xsl:attribute name="node_id">
        <xsl:value-of select="@node_id"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </P>
  </xsl:template>
</xsl:stylesheet>

<!-- $Id: selections.xsl,v 1.8 2000/01/31 17:50:34 $ -->
<!-- Descriptions: Transform CML selection gestures to HTML -->
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="menu">
    <xsl:if test="@title!=''">
      <h2>
        <a name="#{@name}">
          <xsl:value-of select="@title"/>
        </a>
      </h2>
    </xsl:if>
    <xsl:apply-templates select="message"/>
    <ol node_id="{@node_id}">
      <xsl:for-each select="choices/choice|choices/default">
        <li>
          <a href="{@value}">
            <xsl:apply-templates/>
          </a>
        </li>
      </xsl:for-each>
    </ol>
  </xsl:template>

  <xsl:template match="select">
    <xsl:apply-templates select="message"/>
    <select name="{@name}">
      <xsl:apply-templates select="choices"/>
    </select>
    <p/>
  </xsl:template>

  <xsl:template match="choices">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="choice|default">
    <option>
      <xsl:attribute name="value">
        <xsl:value-of select="@value"/>
      </xsl:attribute>
      <xsl:if test="name(.)='default'">
        <xsl:attribute name="checked"/>
      </xsl:if>
      <xsl:apply-templates/>
    </option>
  </xsl:template>

  <xsl:template match="grammar"/>
</xsl:stylesheet>

<!-- $Id: identity.xsl,v 1.1 1999/11/08 18:05:26 Exp $ -->
<!-- Description: Identity transform for use in other sheets -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="*|@*">
    <xsl:value-of select="."/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
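
To make the gesture-by-gesture idea concrete, the following Python sketch paraphrases what the `message`, `select`, `choices`, `choice|default` and `grammar` rules above do for a single CML `select` gesture. It is an illustration under stated assumptions rather than the transcoder itself: the function names `choice_label` and `select_to_html` are hypothetical, whitespace is trimmed for readability, and the standard-library `xml.etree.ElementTree` module stands in for an XSLT processor.

```python
import xml.etree.ElementTree as ET


def choice_label(choice):
    """Visible text of a choice, skipping nested elements such as <grammar>."""
    parts = [choice.text or ""] + [child.tail or "" for child in choice]
    return "".join(parts).strip()


def select_to_html(select):
    """Map one CML <select> gesture to an HTML <p>, <select> and empty <p>."""
    pieces = []
    message = select.find("message")
    if message is not None:
        # The `message` rule copies node_id onto a paragraph.
        p = ET.Element("p", {"node_id": message.get("node_id", "")})
        p.text = (message.text or "").strip()
        pieces.append(p)
    html_select = ET.Element("select", {"name": select.get("name", "")})
    choices = select.find("choices")
    for choice in (choices if choices is not None else []):
        if choice.tag not in ("choice", "default"):
            continue
        option = ET.SubElement(html_select, "option", {"value": choice.get("value", "")})
        if choice.tag == "default":
            option.set("checked", "")  # the stylesheet adds an empty `checked` attribute
        option.text = choice_label(choice)
    pieces.append(html_select)
    pieces.append(ET.Element("p"))  # the `select` rule ends with an empty paragraph
    return "".join(ET.tostring(piece, encoding="unicode") for piece in pieces)


if __name__ == "__main__":
    sample = (
        '<select name="cnn.query.interest">'
        '<message node_id="20">Which region are you interested in?</message>'
        '<choices node_id="21">'
        '<choice value="us"> United states </choice>'
        '<choice value="europe"><grammar type="text/jsgf"> (euro | Europe) </grammar>'
        ' Europe </choice>'
        '</choices></select>'
    )
    print(select_to_html(ET.fromstring(sample)))
```

The speech grammar attached to a choice is dropped in this visual rendering, which corresponds to the empty `grammar` rule in the stylesheet.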



(c) HTML Sources 

[0132] The following describes the HTML source page obtained by applying the (CML to HTML) XSL sources on the
CML source page. The resulting welcome GUI page as viewed with an HTML browser is illustrated in FIGs. 6A through
6C.



<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>CNN Mobile News</title>
</head>
<body>
<h1>
<a name="cnn">CNN Mobile News</a>
</h1>
<div node_id="1" name="cnn">
<ol node_id="2">
<li>
<a href="#cnn.query">Select News Stories</a>
</li>
<li>
<a href="#cnn.exit">
Exit </a>
</li>
<li>
<a href="#cnn.applicationHelp">Help</a>
</li>
</ol>

<h2 node_id="4">
<a name="cnn.applicationHelp">About CNN Mobile</a>
</h2>
<P node_id="5">
This application allows you to select and view CNN news stories
</P>
<p>
<a name="#cnn">
Back
</a>
</p>
<h2>
<a name="cnn.exit">Exit CNN Mobile News</a>
</h2>
<form node_id="6"
action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi?command=e
<P node_id="60">
Thankyou for using the CNN news service
</P>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.exit">
</p>
</form>

<div groupId="query" modality="" class="">
<h2 node_id="8">
<a name="cnn.query">Search CNN Mobile News</a>
</h2>
<h2>
<a name="#cnn.query.topic">Topic Selection</a>
</h2>
<ol node_id="11">
<li>
<a href="#cnn.query.news"> News </a>
</li>
<li>
<a href="#cnn.query.business"> Business </a>
</li>
<li>
<a href="#cnn.query.sports">
Sports
</a>
</li>
<li>
<a href="#cnn.query.travel"> Travel </a>
</li>
<li>
<a href="#cnn.query.weather"> Weather </a>
</li>
<li>
<a href="#cnn.query.show">
Show Business
</a>
</li>
</ol>

<h2>
<a name="cnn.query.news">News Channel</a>
</h2>
<form node_id="13"
action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi?comm
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="14">
Which news category would you like to read?
</P>
<select name="cnn.query.interest">
<option value="business">
Business
</option>
<option value="africa">
Africa</option>
<option value="world"> World </option>
<option value="United states"> United states </option>
<option value="europe"> Europe </option>
<option value="Asia"> Asia</option>
<option value="me"> Middle East</option>
<option value="america"> America </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.news">
</p>
</form>
<h2>
<a name="cnn.query.business">Business Channel</a>
</h2>
<form node_id="16"
action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi?command=search">
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="17">
Which business category would you like to read?</P>
<select name="cnn.query.interest">
<option value="NEWS"> news </option>
<option value="IN"> indexes </option>
<option value="CU"> exchange rates </option>
<option value="MET"> metals </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.business">
</p>
</form>
<h2>
<a name="cnn.query.weather">Weather Channel</a>
</h2>
<form node_id="19"
action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi?
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="20">
Which region are you interested in?</P>
<select name="cnn.query.interest">
<option value="us"> United states </option>
<option value="europe">
Europe
</option>
<option value="JP"> Japan </option>
<option value="AU"> Australia </option>
<option value="AS"> Asia </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.weather">
</p>
</form>
<h2>
<a name="cnn.query.travel">Travel Section</a>
</h2>
<form node_id="22"
action="http://raman.almaden.ibm.com/cgi-bin/cn
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="23">
Which city do you want to visit?</P>
<select name="cnn.query.interest">
<option value="AMSTERDAM">AMSTERDAM</option>
<option value="COPENHAGEN">COPENHAGEN</option>
<option value="HELSINKI">HELSINKI</option>
<option value="HONGKONG">HONGKONG</option>
<option value="LONDON">LONDON</option>
<option value="OSLO">OSLO</option>
<option value="PRAGUE">PRAGUE</option>
<option value="SINGAPORE">SINGAPORE</option>
<option value="STOCKHOLM">STOCKHOLM</option>
<option value="SYDNEY">SYDNEY</option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.travel">
</p>
</form>
<h2>
<a name="cnn.query.sports">Sports Channel</a>
</h2>
<form node_id="25" action=
"http://raman.almaden.ibm.com/cgi-bin/cnn.cgi?command=search">
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="26">
What sports are you interested in?</P>
<select name="cnn.query.interest">
<option value="AS"> Asia </option>
<option value="w"> world </option>
<option value="eu"> europe </option>
<option value="us"> united states </option>
<option value="nba"> NBA </option>
<option value="nhl"> nhl </option>
<option value="EF"> Europoean football </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.sports">
</p>
</form>
</div>
</div>
</body>
</html>

(d) Gesture XSL 

[0133] The following example illustrates the CML to WML gesture-based XSL rules that are used to transcode, gesture
by gesture, a CML page into a WML page. Not all of the gesture-based transcoding rules required to transcode any
possible CML page are present; the example is to be considered an illustration of the method. The XSL syntax follows
conventional XSLT rules; see, e.g., http://www.w3.org/1999/XSL/Transform.

<!-- $Id: cml2html.xsl,v 1.9 2000/02/05 19:32:40 Exp $ -->
<!-- Description: Transform CML to HTML -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xt="http://www.jclark.com/xt" version="1.0" extension-element-prefixes="xt">
  <xsl:include href="html/cml.xsl" />
  <xsl:include href="html/environment.xsl" />
  <xsl:include href="html/modality.xsl" />
  <xsl:include href="html/output.xsl" />
  <xsl:include href="html/selections.xsl" />
  <xsl:include href="common/identity.xsl" />
</xsl:stylesheet>

<!-- $Id: cml.xsl,v 1.13 2000/01/31 Exp $ -->
<!-- Description: Translate CML element to HTML -->
<!-- Handle case of CML element being the top-level element -->
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/cml">
    <html>
      <head>
        <META http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
        <title><xsl:value-of select="@title"/></title>
      </head>
      <body>
        <h1>
          <a name="{@name}">
            <xsl:value-of select="@title"/>
          </a>
        </h1>
        <xsl:choose>
          <xsl:when test="@action='submit'">
            <form>
              <xsl:attribute name="node_id">
                <xsl:value-of select="@node_id"/>
              </xsl:attribute>
              <xsl:attribute name="action">
                <xsl:value-of select="submit/@target"/>
              </xsl:attribute>
              <xsl:apply-templates/>
              <p>
                <INPUT TYPE="SUBMIT" VALUE="{@name}"/>
              </p>
            </form>
          </xsl:when>
          <xsl:otherwise>
            <div node_id="{@node_id}"
                 name="{@name}">
              <xsl:apply-templates/>
            </div>
          </xsl:otherwise>
        </xsl:choose>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="cml[@action='submit']">
    <h2> <a name="{@name}">
      <xsl:value-of select="@title"/> </a>
    </h2>
    <form>
      <xsl:attribute name="node_id">
        <xsl:value-of select="@node_id"/>
      </xsl:attribute>
      <xsl:attribute name="action">
        <!-- for real, we should process submit node to
             cons up target uri -->
        <xsl:value-of select="../submit/@target"/>
      </xsl:attribute>
      <xsl:apply-templates/>
      <p>
        <INPUT TYPE="SUBMIT" VALUE="{@name}"/>
      </p>
    </form>
  </xsl:template>

  <xsl:template match="cml">
    <h2 node_id="{@node_id}">
      <a name="{@name}">
        <xsl:value-of select="@title"/> </a>
    </h2>
    <xsl:apply-templates/>
    <xsl:if test="@action='return'">
      <p>
        <a name="{concat('#', /cml/@name)}">
          Back
        </a>
      </p>
    </xsl:if>
  </xsl:template>

  <xsl:template match="group">
    <div groupId="{@groupId}"
         modality="{@modality}"
         class="{@class}">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <xsl:template match="submit"/>
</xsl:stylesheet>

<!-- $Id: environment.xsl,v 1.2 2000/02/01 Exp $ -->
<!-- Description: Process CML environment constructs -->
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="final">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="var">
    <input type="hidden" name="{@name}" value="{@expr}"/>
  </xsl:template>
  <xsl:template match="assign">
    <input name="{@name}" type="hidden">
      <xsl:attribute name="value">
        <xsl:choose>
          <xsl:when test="@expr=''">
            <xsl:value-of select="./node()"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="@expr"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:attribute>
    </input>
  </xsl:template>
  <xsl:template match="value">
    <b><xsl:value-of select="@name"/></b>
  </xsl:template>
</xsl:stylesheet>

<!-- $Id: modality.xsl,v 1.1 2000/02/05 19:32:00 Exp $ -->
<!-- Description: Process CML modality constructs -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="modality[@class='visual']">
    <xsl:apply-templates />
  </xsl:template>
  <xsl:template match="var">
    <input type="hidden" name="{@name}" value="{@expr}" />
  </xsl:template>
  <xsl:template match="assign">
    <input name="{@name}" type="hidden">
      <xsl:attribute name="value">
        <xsl:choose>
          <xsl:when test="@expr=''">
            <xsl:value-of select="./node()" />
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="@expr" />
          </xsl:otherwise>
        </xsl:choose>
      </xsl:attribute>
    </input>
  </xsl:template>
  <xsl:template match="value">
    <b>
      <xsl:value-of select="@name" />
    </b>
  </xsl:template>
</xsl:stylesheet>

<!-- $Id: output.xsl,v 1.3 1999/11/12 20:07:23 Exp $ -->
<!-- Description: Transformation rules for CML gestures that -->
<!-- primarily output information -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="message">
    <P>
      <xsl:attribute name="node_id">
        <xsl:value-of select="@node_id"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </P>
  </xsl:template>

  <!-- eventually generate pop-up help via javascript -->
  <xsl:template match="help">
    <P>
      <xsl:attribute name="node_id">
        <xsl:value-of select="@node_id"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </P>
  </xsl:template>
</xsl:stylesheet>

<!-- $Id: selections.xsl,v 1.8 2000/01/31 17:50:34 $ -->
<!-- Descriptions: Transform CML selection gestures to HTML -->
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="menu">
    <xsl:if test="@title!=''">
      <h2>
        <a name="#{@name}">
          <xsl:value-of select="@title"/>
        </a>
      </h2>
    </xsl:if>
    <xsl:apply-templates select="message"/>
    <ol node_id="{@node_id}">
      <xsl:for-each select="choices/choice|choices/default">
        <li>
          <a href="{@value}">
            <xsl:apply-templates/>
          </a>
        </li>
      </xsl:for-each>
    </ol>
  </xsl:template>

  <xsl:template match="select">
    <xsl:apply-templates select="message"/>
    <select name="{@name}">
      <xsl:apply-templates select="choices"/>
    </select>
    <p/>
  </xsl:template>

  <xsl:template match="choices">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="choice|default">
    <option>
      <xsl:attribute name="value">
        <xsl:value-of select="@value"/>
      </xsl:attribute>
      <xsl:if test="name(.)='default'">
        <xsl:attribute name="checked"/>
      </xsl:if>
      <xsl:apply-templates/>
    </option>
  </xsl:template>

  <xsl:template match="grammar" />
</xsl:stylesheet>

<!-- $Id: identity.xsl,v 1.1 1999/11/08 18:05:26 Exp $ -->
<!-- Description: Identity transform for use in other sheets -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="*|@*">
    <xsl:value-of select="."/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>



(e) WML Sources 

[0134] The following describes the WML source page obtained by applying the (CML to WML) XSL sources on the
CML source page. The resulting welcome GUI page as viewed with a WML browser is illustrated in FIG. 7.

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
"http://www.wapforum.org/DTD/wml_1.1.
<wml>
<template>
<do type="prev" label="Back">
<prev/>
</do>
</template>
<card id="cnn.command" title="cnn.command">
<p>
<select name="cnn.command">
<option onpick="#cnn.query">Select News Stories</option>
<option onpick="#cnn.exit">
Exit </option>
<option onpick="#cnn.applicationHelp">Help</option>
</select>
</p>
</card>
<card id="cnn.applicationHelp" title="cnn.applicationHelp">
<p>
This application allows you to select and view CNN news stories
</p>
</card>
<card id="cnn.exit" title="cnn.exit">
<p>
Thankyou for using the CNN news service
</p>
<p align="center">
<a href="cnn.wmls#submit()"/>
</p>
</card>

<card id="cnn.query" title="cnn.query">
<p>
<select name="cnn.query">
<option onpick="#cnn.query.news"> News </option>
<option onpick="#cnn.query.business"> Business </option>
<option onpick="#cnn.query.sports">
Sports
</option>
<option onpick="#cnn.query.travel"> Travel </option>
<option onpick="#cnn.query.weather"> Weather </option>
<option onpick="#cnn.query.show">
Show Business
</option>
</select>
</p>
</card>

<card id="cnn.query.news" title="cnn.query.news">
<p>
Which part of today's news would you like to read?<select name="cnn.query.part">
<option value="h" onpick="cnn.wmls#submit()"> Headlines</option>
<option value="1" onpick="cnn.wmls#submit()"> first story </option>
<option value="2" onpick="cnn.wmls#submit()"> second story </option>
<option value="3" onpick="cnn.wmls#submit()"> third story </option>
</select>
</p>
<p>
Which news category would you like to read?
<select name="cnn.query.interest">
<option value="business" onpick="cnn.wmls#submit()">
Business
</option>
<option value="africa" onpick="cnn.wmls#submit()">
Africa</option>
<option value="world" onpick="cnn.wmls#submit()"> World </option>
<option value="United states" onpick="cnn.wmls#submit()"> United states </option>
<option value="europe" onpick="cnn.wmls#submit()"> Europe </option>
<option value="Asia" onpick="cnn.wmls#submit()"> Asia</option>
<option value="me" onpick="cnn.wmls#submit()"> Middle East</option>
<option value="america" onpick="cnn.wmls#submit()"> America </option>
</select>
</p>
<p align="center">
<a href="cnn.wmls#submit()"/>
</p>
</card>

<card id="cnn.query.business" title="cnn.query.business">
<p>
Which part of today's news would you like to read?<select name="cnn.query.part">
<option value="h" onpick="cnn.wmls#submit()"> Headlines</option>
<option value="1" onpick="cnn.wmls#submit()"> first story </option>
<option value="2" onpick="cnn.wmls#submit()"> second story </option>
<option value="3" onpick="cnn.wmls#submit()"> third story </option>
</select>
</p>
<p>
Which business category would you like to read?<select name="cnn.query.interest">
<option value="NEWS" onpick="cnn.wmls#submit()"> news </option>
<option value="IN" onpick="cnn.wmls#submit()"> indexes </option>
<option value="CU" onpick="cnn.wmls#submit()"> exchange rates </option>
<option value="MET" onpick="cnn.wmls#submit()"> metals </option>
</select>
</p>
<p align="center">
<a href="cnn.wmls#submit()"/>
</p>
</card>

<card id="cnn.query.weather" title="cnn.query.weather">
<p>
Which part of today's news would you like to read?<select name="cnn.query.part">
<option value="h" onpick="cnn.wmls#submit()"> Headlines</option>
<option value="1" onpick="cnn.wmls#submit()"> first story </option>
<option value="2" onpick="cnn.wmls#submit()"> second story </option>
<option value="3" onpick="cnn.wmls#submit()"> third story </option>
</select>
</p>
<p>
Which region are you interested in?<select name="cnn.query.interest">
<option value="us" onpick="cnn.wmls#submit()"> United states </option>
<option value="europe" onpick="cnn.wmls#submit()">
Europe
</option>
<option value="JP" onpick="cnn.wmls#submit()"> Japan </option>
<option value="AU" onpick="cnn.wmls#submit()"> Australia </option>
<option value="AS" onpick="cnn.wmls#submit()"> Asia </option>
</select>
</p>
<p align="center">
<a href="cnn.wmls#submit()"/>
</p>
</card>

<card id="cnn.query.travel" title="cnn.query.travel">
<p>
Which part of today's news would you like to read?<select name="cnn.query.part">
<option value="h" onpick="cnn.wmls#submit()"> Headlines</option>
<option value="1" onpick="cnn.wmls#submit()"> first story </option>
<option value="2" onpick="cnn.wmls#submit()"> second story </option>
<option value="3" onpick="cnn.wmls#submit()"> third story </option>
</select>
</p>
<p>
Which city do you want to visit?<select name="cnn.query.interest">
<option value="AMSTERDAM" onpick="cnn.wmls#submit()">AMSTERDAM</option>
<option value="COPENHAGEN" onpick="cnn.wmls#submit()">COPENHAGEN</option>
<option value="HELSINKI" onpick="cnn.wmls#submit()">HELSINKI</option>
<option value="HONGKONG" onpick="cnn.wmls#submit()">HONGKONG</option>
<option value="LONDON" onpick="cnn.wmls#submit()">LONDON</option>
<option value="OSLO" onpick="cnn.wmls#submit()">OSLO</option>
<option value="PRAGUE" onpick="cnn.wmls#submit()">PRAGUE</option>
<option value="SINGAPORE" onpick="cnn.wmls#submit()">SINGAPORE</option>
<option value="STOCKHOLM" onpick="cnn.wmls#submit()">STOCKHOLM</option>
<option value="SYDNEY" onpick="cnn.wmls#submit()">SYDNEY</option>
</select>
</p>
<p align="center">
<a href="cnn.wmls#submit()"/>
</p>
</card>

<card id="cnn.query.sports" title="cnn.query.sports">
<p>
Which part of today's news would you like to read?<select name="cnn.query.part">
<option value="h" onpick="cnn.wmls#submit()"> Headlines</option>
<option value="1" onpick="cnn.wmls#submit()"> first story </option>
<option value="2" onpick="cnn.wmls#submit()"> second story </option>
<option value="3" onpick="cnn.wmls#submit()"> third story </option>
</select>
</p>
<p>
What sports are you interested in?<select name="cnn.query.interest">
<option value="AS" onpick="cnn.wmls#submit()"> Asia </option>
<option value="w" onpick="cnn.wmls#submit()"> world </option>
<option value="eu" onpick="cnn.wmls#submit()"> europe </option>
<option value="us" onpick="cnn.wmls#submit()"> united states </option>
<option value="nba" onpick="cnn.wmls#submit()"> NBA </option>
<option value="nhl" onpick="cnn.wmls#submit()"> nhl </option>
<option value="EF" onpick="cnn.wmls#submit()"> Europoean football </option>
</select>
</p>
<p align="center">
<a href="cnn.wmls#submit()"/>
</p>
</card>
</wml>
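
Every option in the query cards above carries the same `onpick` target, built from the name of the root CML page followed by `.wmls#submit()`. The following Python sketch shows one way that mapping could be expressed for a single CML `select` gesture. It is an illustration only: the function names and the `app_name` parameter are hypothetical, whitespace is trimmed for readability, and `xml.etree.ElementTree` stands in for an XSLT processor.

```python
import xml.etree.ElementTree as ET


def choice_label(choice):
    """Visible text of a choice, skipping nested elements such as <grammar>."""
    parts = [choice.text or ""] + [child.tail or "" for child in choice]
    return "".join(parts).strip()


def select_to_wml(select, app_name):
    """Map one CML <select> gesture to a WML <p> holding the prompt and a <select>."""
    p = ET.Element("p")
    message = select.find("message")
    # In the WML cards the prompt is plain text placed directly before the <select>.
    p.text = (message.text or "").strip() if message is not None else ""
    wml_select = ET.SubElement(p, "select", {"name": select.get("name", "")})
    choices = select.find("choices")
    for choice in (choices if choices is not None else []):
        if choice.tag not in ("choice", "default"):
            continue
        option = ET.SubElement(
            wml_select,
            "option",
            {
                "value": choice.get("value", ""),
                # Assumption: app_name is the name of the root CML page, e.g. "cnn".
                "onpick": f"{app_name}.wmls#submit()",
            },
        )
        option.text = choice_label(choice)
    return ET.tostring(p, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<select name="cnn.query.interest">'
        '<message node_id="20">Which region are you interested in?</message>'
        '<choices node_id="21">'
        '<choice value="us"> United states </choice>'
        '<choice value="JP"> Japan </choice>'
        '</choices></select>'
    )
    print(select_to_wml(ET.fromstring(sample), "cnn"))
```

Compared with the HTML rendering, the same gesture yields a different container and a per-option action, while the `name` and `value` attributes carry over unchanged.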

(f) Gesture XSL 

[0135] The following example illustrates the CML to VoiceXML gesture-based XSL rules that are used to transcode,
gesture by gesture, a CML page into a VoiceXML page. Not all of the gesture-based transcoding rules required to transcode
any possible CML page are present; the example is to be considered an illustration of the method. The XSL syntax follows
conventional XSLT rules; see, e.g., http://www.w3.org/1999/XSL/Transform.

<!-- cml2wml.xsl -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!--
  <xsl:output method="html" indent="yes"/>
  -->
  <xsl:output method="xml" indent="yes" media-type="text/xml"/>
  <xsl:template match="/cml">
    <xsl:text disable-output-escaping="yes">
      <!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
      "http://www.wapforum.org/DTD/wml_1
    </xsl:text>
    <wml>
      <template>
        <do type="prev" label="Back">
          <prev/>
        </do>
      </template>
      <xsl:apply-templates/>
    </wml>
  </xsl:template>

  <xsl:template match="cml">
    <xsl:choose>
      <xsl:when test="menu">
        <!-- to avoid <card><card>..</card></card> -->
        <card>
          <xsl:attribute name="id">
            <xsl:value-of select="@name"/>
          </xsl:attribute>
          <xsl:attribute name="title">
            <xsl:value-of select="@name"/>
          </xsl:attribute>
          <p><select>
            <xsl:attribute name="name">
              <xsl:value-of select="menu/@name"/>
            </xsl:attribute>
            <xsl:apply-templates select="menu/message"/>
            <xsl:for-each select="menu/choices/choice | menu/choices/default">
              <option>
                <xsl:attribute name="value">
                  <xsl:value-of select="@value"/>
                </xsl:attribute>
                <xsl:attribute name="onpick">#<xsl:value-of select="@value"/></xsl:attribute>
                <xsl:call-template name="lex"/></option>
            </xsl:for-each>
          </select>
          </p>
        </card>
      </xsl:when>
      <xsl:otherwise>
        <card>
          <xsl:attribute name="id">
            <xsl:value-of select="@name"/>
          </xsl:attribute>
          <xsl:attribute name="title">
            <xsl:value-of select="@name"/>
          </xsl:attribute>
          <xsl:apply-templates/>
        </card>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

<xsl:template match="cml[@action='submit']">
  <card>
    <xsl:attribute name="id">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <xsl:attribute name="title">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <xsl:apply-templates/>
    <p align="center">
      <a>
        <xsl:attribute name="href">
          <xsl:value-of select="/cml/@name"/>.wmls#submit()</xsl:attribute>
      </a>
    </p>
  </card>
</xsl:template>

<xsl:template match="select">
  <p>
    <xsl:apply-templates select="message"/>
    <select>
      <xsl:attribute name="name">
        <xsl:value-of select="@name"/>
      </xsl:attribute>
      <xsl:for-each select="choices/choice | choices/default">
        <option>
          <xsl:attribute name="value">
            <xsl:value-of select="@value"/>
          </xsl:attribute>
          <xsl:attribute name="onpick">
            <xsl:value-of select="/cml/@name"/>.wmls#submit()</xsl:attribute>
          <xsl:call-template name="lex"/></option>
      </xsl:for-each>
    </select>
  </p>
</xsl:template>
<xsl:template match="menu">
  <card>
    <xsl:attribute name="id">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <xsl:attribute name="title">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <p>
      <select>
        <xsl:attribute name="name">
          <xsl:value-of select="@name"/>
        </xsl:attribute>
        <xsl:apply-templates select="message"/>
        <xsl:for-each select="choices/choice | choices/default">
          <option>
            <xsl:attribute name="value">
              <xsl:value-of select="@value"/>
            </xsl:attribute>
            <xsl:attribute name="onpick">#<xsl:value-of select="@value"/></xsl:attribute>
            <xsl:call-template name="lex"/></option>
        </xsl:for-each>
      </select>
    </p>
  </card>
</xsl:template>
<xsl:template name="lex">
  <xsl:for-each select="node()">
    <xsl:if test="position()=last()">
      <xsl:value-of select="current()"/>
    </xsl:if>
  </xsl:for-each>
</xsl:template>

<!-- explicitly remove segment -->
<xsl:template match="submit"/>
<xsl:template match="message"/>
</xsl:stylesheet>
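
By way of illustration only, the gesture-by-gesture rule that the stylesheet above applies to a menu gesture can be sketched outside XSLT. The following Python sketch assumes a simplified CML page holding a single menu; the function names `lex` and `menu_to_card` and the sample page are hypothetical and are not part of the listing. It emits one WML option per choice or default, in document order, and takes the label from the trailing text of each choice, which approximates the "lex" template.

```python
import xml.etree.ElementTree as ET


def lex(choice):
    """Approximate the 'lex' template: the text of the last child node.

    ElementTree keeps text that follows the last child element in that
    child's .tail; when there are no child elements the text is in .text.
    """
    children = list(choice)
    text = children[-1].tail if children else choice.text
    return (text or "").strip()


def menu_to_card(cml):
    """Transcode a <cml> page containing one <menu> into a WML <card>."""
    menu = cml.find("menu")
    card = ET.Element("card", {"id": cml.get("name"), "title": cml.get("name")})
    p = ET.SubElement(card, "p")
    select = ET.SubElement(p, "select", {"name": menu.get("name")})
    # The XPath union 'choices/choice | choices/default' keeps document
    # order, so walk the children of <choices> once and filter by tag.
    for item in menu.find("choices"):
        if item.tag not in ("choice", "default"):
            continue
        value = item.get("value")
        option = ET.SubElement(
            select, "option", {"value": value, "onpick": "#" + value})
        option.text = lex(item)
    return card


# Hypothetical sample page, loosely modeled on the CNN example.
SAMPLE = """
<cml name="cnn">
  <menu name="cnn.command">
    <choices>
      <default value="cnn.query">Select News Stories</default>
      <choice value="cnn.exit">Exit</choice>
      <choice value="cnn.query.sports"><grammar>(sport | sports)</grammar>
        Sports
      </choice>
    </choices>
  </menu>
</cml>
"""

card = menu_to_card(ET.fromstring(SAMPLE))
print(ET.tostring(card, encoding="unicode"))
```

The grammar element inside a choice is skipped because only the trailing text is used as the label, which is the same effect the "lex" template has in the stylesheet.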

(g) XSL Source to produce VoiceXML

[0136] The following describes the XSL source code used to produce the VoiceXML source page.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/cml">
  <vxml>
    <xsl:apply-templates/>
  </vxml>
</xsl:template>

<xsl:template match="menu">
  <menu>
    <xsl:apply-templates select="message"/>
    <xsl:attribute name="id">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <xsl:attribute name="node_id">
      <xsl:value-of select="@node_id"/>
    </xsl:attribute>
    <xsl:apply-templates select="message"/>
    <prompt> Say one of <enumerate/> </prompt>
    <xsl:for-each select="choices/choice|choices/default">
      <choice>
        <xsl:attribute name="next">#<xsl:value-of select="@value"/></xsl:attribute>
        <xsl:apply-templates/>
      </choice>
    </xsl:for-each>
  </menu>
</xsl:template>

<xsl:template match="cml[@action='return']">
  <form>
    <xsl:attribute name="id">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <xsl:attribute name="node_id">
      <xsl:value-of select="@node_id"/>
    </xsl:attribute>
    <xsl:apply-templates/>
    <block><goto>
      <xsl:attribute name="next">#<xsl:value-of select="/cml/menu/@name"/></xsl:attribute>
    </goto></block>
  </form>
</xsl:template>

<xsl:template match="cml[@action='submit']">
  <form>
    <xsl:attribute name="id">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <xsl:attribute name="node_id">
      <xsl:value-of select="@node_id"/>
    </xsl:attribute>
    <xsl:apply-templates/>
    <block>
      <goto next="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
        <xsl:if test="select[@name]">
          <xsl:for-each select="select">
            <xsl:attribute name="submit">
              <xsl:value-of select="@name"/>
            </xsl:attribute>
          </xsl:for-each>
        </xsl:if>
      </goto>
    </block>
  </form>
</xsl:template>

<xsl:template match="select">
  <field>
    <xsl:attribute name="name">
      <xsl:value-of select="@name"/>
    </xsl:attribute>
    <xsl:attribute name="node_id">
      <xsl:value-of select="../@node_id"/>
    </xsl:attribute>
    <xsl:if test="message">
      <prompt>
        <xsl:value-of select="message"/>
        Say one of <enumerate/>
      </prompt>
    </xsl:if>
    <grammar>
      <xsl:for-each select="choices/choice|choices/default">
        <xsl:call-template name="lex"/>
        <xsl:if test="following-sibling::choice">|</xsl:if>
      </xsl:for-each>
    </grammar>
  </field>
</xsl:template>

<xsl:template match="message">
  <field><prompt>
    <xsl:attribute name="node_id">
      <xsl:value-of select="@node_id"/>
    </xsl:attribute>
    <xsl:apply-templates/>
  </prompt>
  </field>
</xsl:template>

<xsl:template match="help">
  <help>
    <xsl:attribute name="node_id">
      <xsl:value-of select="@node_id"/>
    </xsl:attribute>
    <xsl:apply-templates/>
  </help>
</xsl:template>

<xsl:template match="grammar"/>
<xsl:template match="submit"/>
<xsl:template name="lex">
  <xsl:for-each select="node()">
    <xsl:if test="position()=last()">
      <xsl:value-of select="current()"/>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
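
As a hedged illustration of the select rule in the stylesheet above, the following Python sketch builds a VoiceXML field from a CML select gesture: the field inherits the node_id of the enclosing cml element, the message becomes the prompt, and the choice labels are joined with "|" to form the grammar. The function names `last_text` and `select_to_field` and the sample page are assumptions introduced for this sketch only.

```python
import xml.etree.ElementTree as ET


def last_text(choice):
    """Label of a choice: the text after its last child element, if any."""
    children = list(choice)
    text = children[-1].tail if children else choice.text
    return (text or "").strip()


def select_to_field(select, parent_node_id):
    """Transcode a CML <select> gesture into a VoiceXML <field>."""
    field = ET.Element(
        "field", {"name": select.get("name"), "node_id": parent_node_id})
    message = select.find("message")
    if message is not None:
        prompt = ET.SubElement(field, "prompt")
        prompt.text = (message.text or "").strip() + " Say one of "
        ET.SubElement(prompt, "enumerate")
    grammar = ET.SubElement(field, "grammar")
    items = [c for c in select.find("choices")
             if c.tag in ("choice", "default")]
    # The stylesheet writes "|" after every item that has a following
    # choice; joining the labels yields the same alternatives.
    grammar.text = " | ".join(last_text(c) for c in items)
    return field


# Hypothetical sample, loosely modeled on the weather channel gesture.
SAMPLE = """
<cml name="cnn.query.weather" node_id="19" action="submit">
  <select name="cnn.query.interest">
    <message node_id="20">Which region are you interested in?</message>
    <choices node_id="21">
      <choice value="us">United states</choice>
      <choice value="JP">Japan</choice>
      <choice value="AS">Asia</choice>
    </choices>
  </select>
</cml>
"""

page = ET.fromstring(SAMPLE)
field = select_to_field(page.find("select"), page.get("node_id"))
print(ET.tostring(field, encoding="unicode"))
```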
(h) VoiceXML Sources

[0137] The following describes the VoiceXML source page obtained by applying the (CML to VoiceXML) XSL sources
on the CML source page. The resulting welcome speech dialog, as presented by a VoiceXML browser, initially presents
the user with a dialog to select by voice between the different options.



<vxml>
<menu id="cnn_command" node_id="2">
<prompt> Say one of <enumerate></enumerate></prompt><choice
next="#cnn_query">Select News
Exit </choice><choice next="#cnn_applicationHelp">Help</choice>
</menu>

<form id="cnn_applicationHelp" node_id="4">
<field><prompt node_id="5">
This application allows you to select and view CNN news stories
</prompt></field>
<block><goto next="#cnn"></goto></block>
</form>

<form id="cnn.exit" node_id="6">
<field><prompt node_id="60">
Thankyou for using the CNN news service
</prompt></field>
<block><goto next="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi"></goto>
</form>

<menu id="cnn_query" node_id="11">
<prompt> Say one of <enumerate></enumerate></prompt><choice
next="#cnn_query_news"> News <
Sports
</choice><choice next="#cnn_query_travel"> Travel </choice><choice next="#cnn_qu
Show Business
</choice>
</menu>

<form id="cnn_query_news" node_id="13">
<field name="cnn_query_part" node_id="13"><prompt>
Which part of today's news would you like to read?</prompt><grammar> Headlines|
<field name="cnn_query_interest" node_id="13"><prompt>
Which news category would you like to read?
</prompt><grammar>
Business
|
Africa| World | United states | Europe | Asia| Middle East| America </gramma
<block><goto next="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi"
submit="cnn_query_int
</form>

<form id="cnn_query_business" node_id="16">
<field name="cnn_query_part" node_id="16"><prompt>
Which part of today's news would you like to read?</prompt><grammar> Headlines|
<field name="cnn_query_interest" node_id="16"><prompt>
Which business category would you like to read?</prompt><grammar> news | indexes
<block><goto next="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi"
submit="cnn_query_int
</form>

<form id="cnn_query_weather" node_id="19">
<field name="cnn_query_part" node_id="19"><prompt>
Which part of today's news would you like to read?</prompt><grammar> Headlines|
<field name="cnn_query_interest" node_id="19"><prompt>
Which region are you interested in?</prompt><grammar> United states |
Europe
| Japan | Australia | Asia </grammar></field>
<block><goto next="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi"
submit="cnn_query_int
</form>

<form id="cnn_query_travel" node_id="22">
<field name="cnn_query_part" node_id="22"><prompt>
Which part of today's news would you like to read?</prompt><grammar> Headlines|
<field name="cnn_query_interest" node_id="22"><prompt>
Which city do you want to
visit?</prompt><grammar>AMSTERDAM|COPENHAGEN|HELSINKI|
<block><goto next="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi"
submit="cnn_query_int
</form>

<form id="cnn_query_sports" node_id="25">
<field name="cnn_query_part" node_id="25"><prompt>
Which part of today's news would you like to read?</prompt><grammar> Headlines|
<field name="cnn_query_interest" node_id="25"><prompt>
What sports are you interested in?</prompt><grammar> Asia | world | europe | uni
<block><goto next="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi"
submit="cnn_query_int
</form>
</vxml>

(iii) Tight multi-modal browsing and multi-device browsing

[0138] As described above and in more detail below, the different modalities can be tightly synchronized. For example,
it can be voice and GUI on a same device or voice on the telephone synchronized with GUI on an HTML or a WML
browser, etc.
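
One way such synchronization could be keyed is sketched below in Python, purely as an illustration. It assumes that the node_id attribute carried into each output page by the listings herein is used as the shared key between views; the helper names `index_by_node_id` and `counterpart` and the two miniature views are hypothetical.

```python
import xml.etree.ElementTree as ET


def index_by_node_id(root):
    """Map each node_id value to the elements of one view that carry it."""
    index = {}
    for element in root.iter():
        node_id = element.get("node_id")
        if node_id is not None:
            index.setdefault(node_id, []).append(element)
    return index


# Two miniature views of the same page (assumed, heavily abbreviated).
GUI_VIEW = ET.fromstring(
    '<body><ol node_id="2"><li>Help</li></ol>'
    '<form node_id="13"><p node_id="9">Which part?</p></form></body>')
VOICE_VIEW = ET.fromstring(
    '<vxml><menu id="cnn_command" node_id="2"/>'
    '<form id="cnn_query_news" node_id="13"/></vxml>')

gui_index = index_by_node_id(GUI_VIEW)
voice_index = index_by_node_id(VOICE_VIEW)


def counterpart(node_id, target_index):
    """Tag names in the other view that share the given node_id."""
    return [element.tag for element in target_index.get(node_id, [])]


# A focus change on GUI node 13 can be mirrored on the voice form 13.
print(counterpart("13", voice_index))
```

A browser that receives an event on a node in one view would look up the same node_id in the other view and move its focus there; how events are actually delivered is outside the scope of this sketch.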

M. Cosmetization

[0139] Modality-specific cosmetic content or parameters can be added using modality-specific XML syntax. Modality-
specific gestures can be added using modality-specific XML syntax with modality qualifiers. Other modalities can ignore
these components or replace them by others (e.g., by captions).

(i) Modality Specific Information

[0140] CML is designed to be a declarative, modality-independent markup language for specifying interaction logic
and conversational application flow. However, we realize that, in the interim, application authors will want to add
modality-specific content to CML applications in order to achieve custom presentations. CML permits this by means of
the element modality, which is used to encapsulate snippets of markup that are intended for use in a specific modality.
Note that such modality-specific snippets will only appear in the specified modality; authors are therefore encouraged
to use such modality-specific snippets only where it is deemed absolutely necessary, and further where the author either
provides an alternative snippet for use in other modalities, or does not care about any other modality. Element modality,
qualified by XML attributes class and module, is defined below:

class: Specifies the class of modalities to which this snippet applies.

module: Specifies the markup language modules that can accept this snippet.

[0141] The following is an HTML-specific snippet that will be passed through to the visual representation. 

<modality class="visual" module="html-basic">
<LINK REL="stylesheet"
HREF="cnn.css"
TYPE="text/css"/>
</modality>
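
The pass-through behavior of the modality element can be sketched as a filter, offered here as an illustration under stated assumptions. The Python sketch below assumes that a transcoder selects a snippet only when both the class and module attributes equal the values of the target representation; exact string matching is an assumption, as are the function name `snippets_for` and the sample page.

```python
import xml.etree.ElementTree as ET


def snippets_for(cml, modality_class, module):
    """Return the markup wrapped by matching <modality> elements."""
    selected = []
    for modality in cml.iter("modality"):
        if (modality.get("class") == modality_class
                and modality.get("module") == module):
            selected.extend(list(modality))
    return selected


# Hypothetical page carrying one visual and one speech snippet.
SAMPLE = """
<cml name="cnn" node_id="1">
  <modality class="visual" module="html-basic">
    <LINK REL="stylesheet" HREF="cnn.css" TYPE="text/css"/>
  </modality>
  <modality class="speech" module="vxml">
    <block>Shop CNN for all your information needs!</block>
  </modality>
</cml>
"""

page = ET.fromstring(SAMPLE)
visual = snippets_for(page, "visual", "html-basic")
speech = snippets_for(page, "speech", "vxml")
print([e.tag for e in visual], [e.tag for e in speech])
```

A snippet whose class and module do not match the target is simply not emitted, which corresponds to the statement that such snippets only appear in the specified modality.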

[0142] The following is an example of a cosmetized CML page: 
<!-- $Id: cnn.cml,v 1.21 2000/02/05 20:08:27 Exp $ -->
<!-- Description: CNN Mobile In cml -->

<cml name="cnn"
     node_id="1"
     title="CNN Mobile News">
<modality class="visual" module="html-basic">
<LINK REL="stylesheet"
HREF="cnn.css"
TYPE="text/css"/>
</modality>

<modality class="visual" module="html">
<TABLE BORDER="0" WIDTH="600" CELLSPACING="0"
CELLPADDING="0"><TR>
<TD WIDTH="122" VALIGN="TOP"><a H
<IMG SRC="http://cnn.com/images/1999/10/cnnstore.gif"
WIDTH="120" HEIGHT="60" BORDER="1" AL
<TD WIDTH="8" VALIGN="TOP"><a HREF="http://cnn.com/ads/
e.market/">
<IMG SRC="http://cnn.com/images/1998/05/homepage/ad.
info.gif" WIDTH="7" HEIGHT="62" BORDER=
<TD WIDTH="470" VALIGN="TOP">
<a
HREF="http://cnn.com/event.ng/Type=click%26RunID=
11875%26ProfileID=34%26AdID=13042%26Group
target="_top">
<img src="http://cnn.com/ads/advertiser/promo/
intercompany_onair/9907/onair_egg_cnn.gif"
border="0" height="60" width="468" alt="Get to the point
news!"/>
</a>
<table width="100%" cellpadding="0" cellspacing="0"
border="0"><tr><td align="right"><font
face="verdana,ARIAL,sans-serif" size="1"><a
</TD></TR></TABLE>
</modality>

<modality class="speech" module="vxml">
<block>
Shop CNN for all your information needs!
</block>
</modality>

<menu name="cnn.command"
      node_id="2">
  <choices node_id="3">
    <default value="#cnn.query">Select News Stories</default>
    <choice value="#cnn.exit"
            require_confirmation="true">
    Exit </choice>
    <choice value="#cnn.applicationHelp">Help</choice>
  </choices>
</menu>

<cml name="cnn.applicationHelp"
     title="About CNN Mobile"
     node_id="4"
     action="return">
  <message
     node_id="5">
  This application allows you to select and view CNN news stories
  </message>
</cml>

<cml name="cnn.exit"
     node_id="6"
     title="Exit CNN Mobile News"
     action="submit">
  <message node_id="60">
  Thankyou for using the CNN news service
  </message>
</cml>

<group node_id="7"
       groupId="query">
<cml name="cnn.query"
     title="Search CNN Mobile News"
     node_id="8">
  <menu name="cnn.query.topic"
        node_id="11"
        title="Topic Selection">
    <choices node_id="12">
      <choice value="#cnn.query.news"> News </choice>
      <choice value="#cnn.query.business"> Business </choice>
      <choice value="#cnn.query.sports">
        <grammar> (sport | sports) </grammar>
        Sports
      </choice>
      <choice value="#cnn.query.travel"> Travel </choice>
      <choice value="#cnn.query.weather"> Weather </choice>
      <choice value="#cnn.query.show">
        <grammar> show [business] </grammar>
        Show business
      </choice>
    </choices>
  </menu>
</cml>

<cml name="cnn.query.news"
     title="News Channel"
     node_id="13"
     action="submit">
  <select name="cnn.query.part">
    <message node_id="9">
    Which part of today's news would you like to read?</message>
    <choices
       node_id="10">
      <choice value="h"> Headlines</choice>
      <choice value="1"> first story </choice>
      <choice value="2"> second story </choice>
      <choice value="3"> third story </choice>
    </choices>
  </select>
  <select name="cnn.query.interest">
    <message node_id="14">
    Which news category would you like to read?
    </message>
    <choices node_id="15">
      <choice value="business">
        <grammar type="text/jsgf">
        business {BIZ}</grammar>
        Business
      </choice>
      <choice value="africa">
      Africa</choice>
      <choice value="world"> World </choice>
      <choice value="United states"> United states </choice>
      <choice value="europe"> Europe </choice>
      <choice value="Asia"> Asia</choice>
      <choice value="me"> Middle East</choice>
      <choice value="america"> America </choice>
    </choices>
  </select>
</cml>

<cml name="cnn.query.business"
     title="Business Channel"
     action="submit"
     node_id="16">
  <select name="cnn.query.part">
    <message node_id="9">
    Which part of today's news would you like to read?</message>
    <choices
       node_id="10">
      <choice value="h"> Headlines</choice>
      <choice value="1"> first story </choice>
      <choice value="2"> second story </choice>
      <choice value="3"> third story </choice>
    </choices>
  </select>
  <select name="cnn.query.interest">
    <message node_id="17">
    Which business category would you like to read?</message>
    <choices node_id="18">
      <choice value="NEWS"> news </choice>
      <choice value="IN"> indexes </choice>
      <choice value="CU"> exchange rates </choice>
      <choice value="MET"> metals </choice>
    </choices>
  </select>
</cml>

<cml name="cnn.query.weather"
     title="Weather Channel"
     action="submit"
     node_id="19">
  <select name="cnn.query.part">
    <message node_id="9">
    Which part of today's news would you like to read?</message>
    <choices node_id="10">
      <choice value="h"> Headlines</choice>
      <choice value="1"> first story </choice>
      <choice value="2"> second story </choice>
      <choice value="3"> third story </choice>
    </choices>
  </select>
  <select name="cnn.query.interest">
    <message node_id="20">
    Which region are you interested in?</message>
    <choices node_id="21">
      <choice value="us"> United states </choice>
      <choice value="europe">
        <grammar type="text/jsgf"> (euro | Europe) </grammar>
        Europe
      </choice>
      <choice value="JP"> Japan </choice>
      <choice value="AU"> Australia </choice>
      <choice value="AS"> Asia </choice>
    </choices>
  </select>
</cml>

<cml name="cnn.query.travel"
     title="Travel Section"
     action="submit"
     node_id="22">
  <select name="cnn.query.part">
    <message node_id="9">
    Which part of today's news would you like to read?</message>
    <choices
       node_id="10">
      <choice value="h"> Headlines</choice>
      <choice value="1"> first story </choice>
      <choice value="2"> second story </choice>
      <choice value="3"> third story </choice>
    </choices>
  </select>
  <select name="cnn.query.interest">
    <message node_id="23">
    Which city do you want to visit?</message>
    <choices node_id="24">
      <choice value="AMSTERDAM">AMSTERDAM</choice>
      <choice value="COPENHAGEN">COPENHAGEN</choice>
      <choice value="HELSINKI">HELSINKI</choice>
      <choice value="HONGKONG">HONGKONG</choice>
      <choice value="LONDON">LONDON</choice>
      <choice value="OSLO">OSLO</choice>
      <choice value="PRAGUE">PRAGUE</choice>
      <choice value="SINGAPORE">SINGAPORE</choice>
      <choice value="STOCKHOLM">STOCKHOLM</choice>
      <choice value="SYDNEY">SYDNEY</choice>
    </choices>
  </select>
</cml>

<cml name="cnn.query.sports"
     action="submit"
     title="Sports Channel"
     node_id="25">
  <select name="cnn.query.part">
    <message node_id="9">
    Which part of today's news would you like to read?</message>
    <choices
       node_id="10">
      <choice value="h"> Headlines</choice>
      <choice value="1"> first story </choice>
      <choice value="2"> second story </choice>
      <choice value="3"> third story </choice>
    </choices>
  </select>
  <select name="cnn.query.interest">
    <message node_id="26">
    What sports are you interested in?</message>
    <choices node_id="27">
      <choice value="AS"> Asia </choice>
      <choice value="w"> world </choice>
      <choice value="eu"> europe </choice>
      <choice value="us"> united states </choice>
      <choice value="nba"> NBA </choice>
      <choice value="nhl"> nhl </choice>
      <choice value="EF"> Europoean football </choice>
    </choices>
  </select>
</cml>

<submit target="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
  <message node_id="28">
  executing <value name="cnn.command"/>
  for <value name="cnn.query.part"/>
  stories about <value name="cnn.query.interest"/>
  from topic <value name="cnn.query.topic"/>
  </message>
  <env name="cnn.command"/>
  <env name="cnn.query.topic"/>
  <env name="cnn.query.interest"/>
  <env name="cnn.query.part"/>
</submit>
</group>

<submit target="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
</submit>

</cml>



[0144] The following describes the HTML source page obtained by applying the (CML to HTML) XSL sources on the
HTML cosmetized CML source page. The resulting welcome GUI page as viewed with an HTML browser is illustrated
in FIG. 8. The cosmetization is clearly visible when compared to the non-cosmetized page. This illustrates the possibility
to cosmetize the page at will. Again, not all cases have been considered, but this clearly illustrates the approach.
[0145] The following is the code associated with the resulting cosmetized HTML source page:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>CNN Mobile News</title>
</head>
<body>
<h1>
<a name="cnn">CNN Mobile News</a>
</h1>
<div node_id="1" name="cnn">
<LINK REL="stylesheet" HREF="cnn.css" TYPE="text/css">
<TABLE BORDER="0" WIDTH="600" CELLSPACING="0"
CELLPADDING="0">
<TR>
<TD WIDTH="122" VALIGN="TOP"><a HREF="http://cgi.cnn.com/cgi-bin/redirect?cnn_store">
<IMG SRC="http://cnn.com/images/1999/10/cnnstore.gif" WIDTH="120" HEIGHT="60" BORDER="1" ALT="CNN Store"></a></TD>
<TD WIDTH="8" VALIGN="TOP"><a HREF="http://cnn.com/ads/e.market/">
<IMG SRC="http://cnn.com/images/1998/05/homepage/ad.info.gif"
WIDTH="7" HEIGHT="62" BORDER="0" ALT="ad info"></a></TD>
<TD WIDTH="470" VALIGN="TOP">
<a HREF="http://cnn.com/event.ng/Type=click%26RunID=11875%26
ProfileID=34%26AdID=13042%26GroupID=15%26FamilyID=1099%26TagValues=4.8.
249.435.594.606%26Redirect=http:%2F%2Fwww.cnn.com%2FH
" target="_top">
<img src="http://cnn.com/ads/advertiser/promo/intercompany_onair/9907/onair_egg_cnn.gif" border="0" height="60" width="468" alt=
"Get to the point news!">
</a>
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tr>
<td align="right"><font face="verdana,ARIAL,sans-serif" size="1"><a
href="http://cnn.com/event.ng/Type=click%26RunID=11875%26ProfileID=
34%26AdID=13042%26GroupID=15%26FamilyID=1099%26TagValues=4.8.249.435.
594.606%26Redirect=http:%2F%2Fwww.cnn.com%2FHLN%2Fi
target="_top"
>Get to the point news!</a></font></td>
</tr>
</table>
</TD>
</TR>
</TABLE>

<ol node_id="2">
<li>
<a href="#cnn.query">Select News Stories</a>
</li>
<li>
<a href="#cnn.exit">
Exit </a>
</li>
<li>
<a href="#cnn.applicationHelp">Help</a>
</li>
</ol>
<h2 node_id="4">
<a name="cnn.applicationHelp">About CNN Mobile</a>
</h2>
<P node_id="5">
This application allows you to select and view CNN news stories
</P>
<p>
<a href="#cnn">
Back
</a>
</p>
<h2>
<a name="cnn.exit">Exit CNN Mobile News</a>
</h2>
<form node_id="6" action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
<P node_id="60">
Thankyou for using the CNN news service
</P>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.exit">
</p>
</form>

<div groupId="query" modality="" class="">
<h2 node_id="8">
<a name="cnn.query">Search CNN Mobile News</a>
</h2>
<h2>
<a name="#cnn.query.topic">Topic Selection</a>
</h2>
<ol node_id="11">
<li>
<a href="#cnn.query.news"> News </a>
</li>
<li>
<a href="#cnn.query.business"> Business </a>
</li>
<li>
<a href="#cnn.query.sports">
Sports
</a>
</li>
<li>
<a href="#cnn.query.travel"> Travel </a>
</li>
<li>
<a href="#cnn.query.weather"> Weather </a>
</li>
<li>
<a href="#cnn.query.show">
Show Business
</a>
</li>
</ol>

<h2>
<a name="cnn.query.news">News Channel</a>
</h2>
<form node_id="13" action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="14">
Which news category would you like to read?
</P>
<select name="cnn.query.interest">
<option value="business">
Business
</option>
<option value="africa">
Africa</option>
<option value="world"> World </option>
<option value="United states"> United states </option>
<option value="europe"> Europe </option>
<option value="Asia"> Asia</option>
<option value="me"> Middle East</option>
<option value="america"> America </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.news">
</p>
</form>
<h2>
<a name="cnn.query.business">Business Channel</a>
</h2>
<form node_id="16" action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="17">
Which business category would you like to read?</P>
<select name="cnn.query.interest">
<option value="NEWS"> news </option>
<option value="IN"> indexes </option>
<option value="CU"> exchange rates </option>
<option value="MET"> metals </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.business">
</p>
</form>
<h2>
<a name="cnn.query.weather">Weather Channel</a>
</h2>
<form node_id="19" action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="20">
Which region are you interested in?</P>
<select name="cnn.query.interest">
<option value="us"> United states </option>
<option value="europe">
Europe
</option>
<option value="JP"> Japan </option>
<option value="AU"> Australia </option>
<option value="AS"> Asia </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.weather">
</p>
</form>
<h2>
<a name="cnn.query.travel">Travel Section</a>
</h2>
<form node_id="22" action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
<P node_id="9">
Which part of today's news would you like to read?</P> <select
name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="23">
Which city do you want to visit?</P>
<select name="cnn.query.interest">
<option value="AMSTERDAM">AMSTERDAM</option>
<option value="COPENHAGEN">COPENHAGEN</option>
<option value="HELSINKI">HELSINKI</option>
<option value="HONGKONG">HONGKONG</option>
<option value="LONDON">LONDON</option>
<option value="OSLO">OSLO</option>
<option value="PRAGUE">PRAGUE</option>
<option value="SINGAPORE">SINGAPORE</option>
<option value="STOCKHOLM">STOCKHOLM</option>
<option value="SYDNEY">SYDNEY</option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.travel">
</p>
</form>
<h2>
<a name="cnn.query.sports">Sports Channel</a>
</h2>
<form node_id="25" action="http://raman.almaden.ibm.com/cgi-bin/cnn.cgi">
<P node_id="9">
Which part of today's news would you like to read?</P>
<select name="cnn.query.part">
<option value="h"> Headlines</option>
<option value="1"> first story </option>
<option value="2"> second story </option>
<option value="3"> third story </option>
</select>
<p>
</p>
<P node_id="26">
What sports are you interested in?</P>
<select name="cnn.query.interest">
<option value="AS"> Asia </option>
<option value="w"> world </option>
<option value="eu"> europe </option>
<option value="us"> united states </option>
<option value="nba"> NBA </option>
<option value="nhl"> nhl </option>
<option value="EF"> Europoean football </option>
</select>
<p>
</p>
<p>
<INPUT TYPE="SUBMIT" VALUE="cnn.query.sports">
</p>
</form>
</div>
</body>
</html>



N. CML DTD - Document Type Definition

[0146] The following represents the CML DTD. It is to be understood that the following DTD description should be
fully understood by anybody familiar with the art of XML. It fully defines the syntax of CML as presented for this
embodiment.



<!-- $Id: cml.dtd,v 1.14 2000/03/02 17:04:02 $ -->
<!-- DTD For Conversational Markup Language CML -->
<!-- Conventions:
     Tags are all lower case.
     Attribute names are all lower case. -->

<!-- {attribute entities -->
<!-- core attributes common to most elements
     node_id  document-wide unique id
     name     Names data item that is populated by this gesture.
     title    Human readable title
     style    URI of custom stylesheet
-->
<!ENTITY % coreattrs
  "node_id ID    #IMPLIED
   name    CDATA #IMPLIED
   style   CDATA #IMPLIED
   trigger CDATA #IMPLIED
   title   CDATA #IMPLIED"
>
<!-- } -->

<!-- {entities -->
<!ENTITY % GESTURE "(cml
                   | select
                   | menu
                   | message
                   | help)">
<!-- } -->

<!-- { TOP LEVEL CML -->
<!ELEMENT group (
  %GESTURE;+)
>
<!ATTLIST group
  id       ID    #REQUIRED
  modality CDATA #IMPLIED
  class    CDATA #IMPLIED
>
<!ELEMENT cml (
  (group | %GESTURE;)+,
  submit?
  )
>
<!ATTLIST cml %coreattrs;>
<!-- } -->

<!-- {gesture message -->
<!ELEMENT message ANY>
<!ATTLIST message %coreattrs;>



78 



EP 1 100 013 A2 



<!-- } -->

<!-- {gesture help -->
<!ELEMENT help ANY>
<!ATTLIST help %coreattrs;>
<!-- } -->

<!-- {gesture boolean -->
<!ELEMENT boolean (
  message,
  help?)
>
<!ATTLIST boolean %coreattrs;
  require_confirmation        (true | false) #IMPLIED
  require_confirmation_if_yes (true | false) #IMPLIED
  require_confirmation_if_no  (true | false) #IMPLIED
  default                     (true | false) #IMPLIED
>
<!-- } -->

<!-- {gesture select -->
<!ELEMENT error ANY>
<!ELEMENT grammar (
  gram,
  help?)
>
<!ATTLIST grammar
  type CDATA #REQUIRED>
<!ELEMENT gram ANY>
<!ELEMENT final ANY>

<!-- open content model for element predicate for now -->
<!-- will use an expression syntax a la xpath and augmented -->
<!-- as needed -->
<!-- will also draw on xforms work -->
<!ELEMENT predicate ANY>
<!ELEMENT choice (
  grammar?,
  #PCDATA)
>
<!ATTLIST choice %coreattrs;
  value CDATA #REQUIRED
>

<!-- default has same content model as choice -->
<!ELEMENT default (
  grammar?,
  #PCDATA)
>
<!ATTLIST default %coreattrs;
  value CDATA #REQUIRED
>
<!ELEMENT choices (
  choice+,
  default?)
>
<!ELEMENT select (
  message,
  help?,
  choices,
  predicate?,
  error?)
>
<!ATTLIST select %coreattrs;
  require_predicate (true | false) #IMPLIED
  selection_type    CDATA          #IMPLIED
>

<!-- {gesture menu -->
<!ELEMENT menu (
  message,
  help?,
  choices)
>
<!ATTLIST menu %coreattrs;>
<!-- } -->

<!-- {constrained input -->
<!-- CML provides gestures for standard dialog components,
     the following is merely a sample list of gestures:
     Date
       Specify date.
     Time
       Specify time.
     Currency
       Specify currency amount.
     Credit card
       Specify a credit card (including card type, card number and
       expiration date).
     Phone
       Specify a telephone number.
     Email
       Specify an email address.
     url
       Specify a url.
     Snail Address
       Specify a snail mail address, including street, city/state/country
       and zip code.
     We will specify formal DTD for these elements. -->
<!-- } -->

<!-- {unconstrained input -->
<!ELEMENT input (
  message,
  help?,
  predicate?)
>
<!ATTLIST input %coreattrs;
  require_predicate (true | false) #IMPLIED
>
<!-- } -->

<!-- {gesture user_identification -->
<!ELEMENT user_identification (
  message,
  help?,
  user,
  identify,
  predicate,
  error)
>
<!ATTLIST user_identification %coreattrs;
  require_predicate (true | false) #IMPLIED
  on_fail           CDATA          #IMPLIED
>
<!-- } -->

<!— {gesture submit — > 
<!ELEMENT cnv EMPTY> 
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<!ATTLlSTenv 

name CDATA #required> 

<!ELEMENT submit ( 

message?, 

help?, 

env*) 

> 

<!ATTLIST submit 
target CDATA #required> 

<!~ } ^> 

<!— {binding events — > 
<!ELENfENT bind^ent EMPTY> 
<!ATTL1ST bind-cvent 
logical CDATA #required 
physical CDATA #implied 
modality CDATA #implied 
> 

<!-}-> 

<!-- {environment -->

<!ELEMENT var EMPTY>
<!ATTLIST var
name CDATA #required
value CDATA #implied
>

<!ELEMENT value EMPTY>
<!ATTLIST value
name CDATA #required
>

<!ELEMENT assign EMPTY>
<!ATTLIST assign
name CDATA #required
value CDATA #required
>

<!-- } -->

<!-- { end of file -->
<!-- End Of DTD
local variables:
folded-file: t
end:
-->

<!-- } -->
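
As an illustration of the select and choices content models in the DTD above, the following Python sketch parses a small CML fragment and checks the order of its child elements. The element names come from the DTD; the fragment itself, its attribute values (including the name attribute on select and the value attribute on choice), and the helper names child_sequence and check_select are hypothetical and are used only for this example. It is a simplified order check, not a full DTD validator.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical CML fragment; element names follow the DTD above,
# the attribute values and text are invented for illustration.
CML_SELECT = """
<select name="drink" selection_type="exclusive">
  <message>What would you like to drink?</message>
  <help>Say coffee or tea.</help>
  <choices>
    <choice value="coffee">Coffee</choice>
    <choice value="tea">Tea</choice>
    <default value="coffee">Coffee</default>
  </choices>
</select>
"""

# select: message, help?, choices, predicate?, error?
SELECT_MODEL = re.compile(r"message,(help,)?choices,(predicate,)?(error,)?")
# choices: choice+, default?
CHOICES_MODEL = re.compile(r"(choice,)+(default,)?")


def child_sequence(element):
    """Return the child tag names as one comma-terminated string."""
    return "".join(child.tag + "," for child in element)


def check_select(xml_text):
    """Check a select element against the two content models above."""
    select = ET.fromstring(xml_text)
    if select.tag != "select":
        return False
    if not SELECT_MODEL.fullmatch(child_sequence(select)):
        return False
    choices = select.find("choices")
    return bool(CHOICES_MODEL.fullmatch(child_sequence(choices)))
```

Calling check_select(CML_SELECT) accepts the fragment, while a select that omits the mandatory message element is rejected.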

[0147] Accordingly, the conversational markup language according to the present invention, as described in detail
herein, provides many useful features and advantages. Programming by interaction permits the definition of the
underlying data model being populated (model) to be separated from the markup language defining the user interaction
(view/controller). This makes possible the construction of tightly synchronized multi-modal interactions and supports
conversational applications. CML according to the invention provides mechanisms to support tight synchronization, e.g.,
a node_id attribute attached to each gesture and mapping of this attribute over to the various outputs. The language
is preferably defined in terms of atomic constructs (the gestures); more complex constructs, if any, are composed of
these atomic gestures along with a clear semantic definition of the complex construct (in terms of dialogs). This enables
mapping the complex modules to different modalities. Voice is considered a first-class user interface (UI) modality
at the same level as GUI. Gestures correspond to elementary dialog components (this includes adding appropriate
data files). Where required, authors wishing to encapsulate modality-specific components may provide a "pass through"
mechanism for encoding modality-specific markup. Modality-specific constructs (either for speech or GUI) may be
limited to this pass-through mechanism. Conversational UI is supported. The markup language captures dialog
components that may be active in parallel. CML is an extensible language, e.g., new gestures can be defined, gesture
transformation rules can be modified, tags/constructs from other languages can be embedded (in pass-through mode).
Modality-specific tags/pass through is the only mechanism for additional cosmetization of a page. CML also provides
an explicit environment for encapsulating application state. CML further provides the ability for the interaction
description to refer to dynamically generated data, as well as supporting callback mechanisms to the backend. Any
conventional method can be used for these purposes. Further, given the detailed description of CML provided herein, various
tools and development environments associated with use of the inventive markup language may be realized by those
skilled in the art.

II. MULTIMODAL BROWSER

[0148] The following is a description of a multimodal browser according to the present invention. This section is
divided into the following subsections for ease of reference: (A) Introduction; (B) Multimodal Shell; (C) Multimodal Shell
and CML; (D) CML and Multimodal Synchronization; (E) CML and Application Authoring; (F) Illustrative Embodiments;
(G) Alternative Embodiments.

A. Introduction 

[0149] Before describing multi-modal browsing according to the present invention, the following is a summary
description of some of the above-referenced patent applications with concepts relating to CML and the multi-modal
browser of the present invention. For ease of reference, the related applications are referred to via their respective attorney
docket numbers.

[0150] Y0999-111 discloses the concepts of: conversational computing, conversational user interface, and
conversational application platform (CVM - Conversational Virtual Machine). The functionalities and behavior/services
described in Y0999-111 and provided by CVM can be, in practice, implemented by the multi-modal browser of the
invention, or by applications which offer a conversational user interface. However, at a conceptual level, it is assumed that
CVM implements all the necessary services to support the browser of the invention.

[0151] Y0998-392 discloses the use of a declarative programming language (referred to as "CML" but which is
different from the language of the invention) to program a conversational (i.e., multi-modal) application. The Y0998-392
language is a declarative language that supports the multi-modal/conversational user interface. In practice, the
example/embodiment provided therein consists of ML pages written according to the "multiple authoring" model instead of
single authoring as provided for in accordance with the present invention. Different examples of the declarative
programming language were taught:

(i) The speech-only ML, also called SpeechML, which led to VoiceXML;

(ii) Multiple files (HTML and VoiceXML or WML and VoiceXML) with synchronization tags between the files;

(iii) Single files with multiple modality descriptions (e.g., <MM><Speech>Speech rendering info </Speech> <GUI>
GUI rendering info </GUI></MM> etc.), again with synchronization info;

(iv) Single file with frame-like model to split the information associated with different modalities (e.g., the speech
content is presented in a "speech frame" in addition to the HTML page).

[0152] None of these items addresses single authoring. Nor do they address supporting, from CML, any target legacy
ML (channel), or the concept of gestures or gesture-based XSL.
[0153] Y0999-178 describes a generic multi-modal shell. It describes how to support and program synchronized
multi-modal applications (whether they be declarative, imperative or hybrid). It uses registration tables where each
application modality registers its state, the commands that it supports and the impact of these commands on the other
modality. Again, there is no teaching of gestures and single authoring. An embodiment describes the architecture when the
application is a browser (i.e., a browser associated with the rendering of each modality) and the shell receives a CML
page (as defined in Y0998-392), builds the registration tables and therefore synchronizes across the modalities.
[0154] Now, as will be explained in the following description, the present invention provides for a multimodal browser
architecture. Such a multimodal browser, as will be described below, makes use of the features and advantages of
CML and the conversational gestures of the language, as described above in detail in Section I, to permit a user to
access information in any modality and on any device supported by the application. For example, visual and spoken
interaction with the multimodal browser is abstracted using a core set of conversational gestures and represented
using CML. Conversational gestures are realized appropriately by each interaction modality. Light-weight information
applications (infoware) may be authored using these basic conversational gestures, and the resulting content when
rendered is projected to a modality/device specific markup language or wire protocol, e.g., VoiceXML, WML, to name
a few.

B. Multimodal Shell

[0155] At the center of operation of the multimodal browser is a multimodal shell mechanism. The multimodal shell
acts as a server to multiple user interface clients or browsers. Browsers providing different interaction modalities,
e.g., a visual HTML browser or an auditory VoiceXML browser, register as clients with the multimodal shell. User
interaction proceeds by the multimodal shell traversing the CML document. During this traversal, the shell orchestrates the
user's interaction with specific pieces of CML infoware by:

(i) Initiating user interaction by passing out an interaction-specific representation of the current CML node to all
registered clients.

(ii) Waiting for an information update from all registered clients that have received the current CML node.

(iii) Possibly resolving conflicts between received information, e.g., the user speaks "right" and points to the left.

(iv) Updating the current CML node based on the information update just received.

(v) Upon successfully executing an update, passing the newly updated application state to all registered
browsers.
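
The five steps above can be sketched as a small Python class. This is only an illustrative sketch under stated assumptions: the class names MultimodalShell and ScriptedClient, the method names present, poll and notify, and the idea of passing the conflict policy in as a resolve callback are all hypothetical, since the text does not specify how conflicts are resolved or how clients are addressed.

```python
class MultimodalShell:
    """Toy shell: serves the current node to clients and merges their updates."""

    def __init__(self):
        self.state = {}    # application state: node id -> value
        self.clients = []  # registered modality browsers

    def register(self, client):
        self.clients.append(client)

    def step(self, node_id, resolve):
        # (i) pass the current node to all registered clients
        for client in self.clients:
            client.present(node_id)
        # (ii) collect information updates (None means no input in that modality)
        polled = (client.poll(node_id) for client in self.clients)
        updates = [update for update in polled if update is not None]
        if not updates:
            return False
        # (iii) resolve conflicts between modalities, if any
        value = updates[0] if len(set(updates)) == 1 else resolve(updates)
        # (iv) update the current node
        self.state[node_id] = value
        # (v) pass the updated application state to all registered clients
        for client in self.clients:
            client.notify(node_id, dict(self.state))
        return True


class ScriptedClient:
    """Stand-in for a modality browser with pre-recorded user input."""

    def __init__(self, answers):
        self.answers = answers
        self.seen = []
        self.last_state = None

    def present(self, node_id):
        self.seen.append(node_id)

    def poll(self, node_id):
        return self.answers.get(node_id)

    def notify(self, node_id, state):
        self.last_state = state
```

With one client answering "left" and another answering "right" for the same node, the resolve callback decides which value is stored; a real shell would need a policy chosen by the application or the platform.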

C. Multimodal Shell And CML 


[0156] As explained above, a CML application is an aggregation of a set of standard conversational gestures. Such
conversational gestures form the basic building blocks of the complete dialog which makes up applications. For
example, in a particular application, the primary task of the application designer is to:

(i) Specify the items of information to collect from the user.

(ii) For each requisite item, specify the constraints, e.g., select from a set, etc.

(iii) Update the application state as each item of information is furnished.

(iv) Package up the collected items of information and submit them to a back-end application server.

[0157] Notice that as specified, the tasks above are independent of the interaction modality in use.

[0158] Different user interface front-ends, e.g., a visual WWW browser, an auditory VoiceXML browser, etc., map
these tasks to appropriate user interface widgets.

[0159] CML documents are hosted by a generic multimodal shell. The shell serves different user interface
realizations, e.g., a visual HTML browser, or an auditory VoiceXML browser. Browsers that wish to be clients of the shell hold
a weak reference to the current application state. Registered clients are notified by the shell when the application state
changes; each client then queries its own weak reference to the application state to extract the relevant information
that it wishes to present to the user.

[0160] The user traverses the CML document by interacting with the application via one of the registered browsers.
As user interaction proceeds, all registered browsers are notified about the current CML node that is the focus of
interaction, and consequently update their presentation as needed. The shell keeps track of the currently open CML
documents, as well as their corresponding application states. Where required, the conversational shell can provide
succinct summaries of the state of any of the currently open applications. Information submitted via any one of the
registered clients is mediated by the shell, which takes care of notifying other registered clients and, where necessary,
the back-end application server.


D. CML And Multimodal Synchronization 

[0161] Synthesizing the interaction-specific realizations of an application from a single CML representation enables
us to synchronize the different aspects of the multimodal interface. Each node in the CML representation is tagged
with a specific node-id. When the CML representation is mapped to an interaction-specific representation, e.g., HTML
or VoiceXML, nodes in the resulting mapping are tagged with the node-id of their corresponding node in the CML
representation. When the user interacts with the browser via a specific modality, the multimodal shell maps the currently
active nodes in the application back to the original CML representation by looking up the relevant node-id. As application
state changes due to user interaction, the shell passes the modified application state along with the node-id of the
modified node to all clients that have registered to be notified. Notified applications update the corresponding nodes
in their interaction-specific representation by checking against the node-id. Notice that registered applications
essentially need to hold a weak reference to the underlying application state. As the interaction-specific rendering engine
updates the necessary nodes, the weak reference will cause the information relevant for the update (and nothing but
the required information) to be automatically retrieved from the shell.
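
The node-id bookkeeping described above can be illustrated with a short Python sketch. Output nodes are modeled as plain dictionaries rather than real HTML or VoiceXML; the function names render_html, render_voicexml and apply_update, and the particular output tags chosen, are assumptions made for this example only.

```python
def render_html(node_id, label):
    """Map one CML node to two HTML-like output nodes, both tagged with its id."""
    return [
        {"tag": "label", "node_id": node_id, "text": label},
        {"tag": "input", "node_id": node_id, "value": ""},
    ]


def render_voicexml(node_id, label):
    """Map one CML node to a single VoiceXML-like output node tagged with its id."""
    return [{"tag": "field", "node_id": node_id, "prompt": label, "value": ""}]


def apply_update(view, node_id, value):
    """Update only the output nodes that trace back to the given CML node."""
    touched = 0
    for out in view:
        if out["node_id"] == node_id and "value" in out:
            out["value"] = value
            touched += 1
    return touched
```

Because every output node carries the id of its originating CML node, the same update message (node id plus new value) can be applied to views whose trees have different shapes.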

[0162] Referring now to FIG. 9, a new interpretation of the MVC model is shown. In accordance with the new
interpretation, the model is the CML description of the interaction. The view is the result of applying the gesture-based XSL
transformation rules to generate the different target MLs that are rendered (views) in the different rendering browsers.
The browsers offer, through interaction with the user, control of the model (and modify its state when an I/O event
occurs in one of the rendering browsers). In accordance with FIG. 9, imagine that V0 is the GUI view (e.g., HTML) and
V1 is the speech view (with natural language or not). C0 is the mono-modal HTML browser only control/interaction.
C1 is the synchronized multi-modal view. C2 is the mono-modal speech control. This approach is fundamentally a new
paradigm.


E. CML And Application Authoring 

[0163] Application creators may interact with a WYSIWYG (what you see is what you get) authoring tool to produce
CML representations of their application. Applications represented in CML are mapped to an interaction-specific
representation, e.g., VoiceXML or HTML, using a standard set of style transformations. Where required, user interface
designers may create custom style transformations to design a specific look and feel or sound and feel. CML authoring
tools may also be created that allow clients to map legacy HTML-only WWW applications to CML for deployment on
the multimodal browser platform. Such a tool provides the necessary bridge to help customers deploy existing WWW
applications on the VoiceXML platform; this solution is more attractive than directly re-authoring to VoiceXML, since
mapping existing applications to CML once enables deployment across a variety of multimodal browser settings. This
is true also for HTML, WML (and other legacy ML).

F. Illustrative Embodiments 

[0164] Referring now to FIGs. 10-12, a migration road map from existing systems to full use of CML in a multimodal
browsing environment according to the present invention is shown.

[0165] FIG. 10 illustrates the current fat client web programming model. Content is mostly written in HTML (statically
stored in that format or dynamically generated). When the content needs to be adapted to a particular browser (e.g.,
a given version of Internet Explorer or Communicator), specific style sheets that are a function of the target browser,
as well as the type of content, are built. This is usually an XML/XSL authoring approach. If another channel/modality
(WML, CHTML, VoiceXML, etc.) is required, the content must be re-written; or the content, when written in HTML or
XML, needs to follow very specific rules and be of a well-known type/domain so that some generic application/business
logic dependent XSL rules can be used to produce these modality-specific legacy languages; and/or the XSL rules
must be re-authored very often. This leads to a plethora of multiple authoring, whether directly in the different legacy
languages or in different style sheets that transform a single XML content into these different legacy MLs.
Today, there is more and more need for access to the Web (i.e., mostly by exchanging HTML), wireless
networks (mostly WML, but other standards exist) and telephone (mostly VoiceXML). Because multiple authoring is the
only solution, the sites that offer such types of services are usually only closed sites (limited amount of services/content,
as opposed to the open full web content) with a limited number of service/content providers, or enterprise sites. There
is no existing solution to offer access to any information, anywhere, at any time through any access device and let the
user manipulate it. The different legacy languages (including XML) do not contain the necessary information to
appropriately handle different parts of the page in other modalities (e.g., the grammars and other arguments for the
conversational engines are missing, etc.).

[0166] FIG. 11 describes the first step to deploy CML and use the programming by interaction programming model
and conversational computing paradigm. This solution can use today's existing infrastructure in terms of the transport
protocols and network (e.g., telephony PSTN, wireless networks (voice and/or data), voice over IP, TCP/IP-HTTP, WAP,
etc.) and legacy browsers (e.g., HTML browser, WML browser, VoiceXML browser, etc.). If content is available in CML,
it can be transcoded, on the fly, to the target legacy ML supported by the requesting browser whenever a page is
served, whether it be statically or dynamically generated. Determination of the target ML is based on the type of browser
or the IP of the gateway, browser or server: a WAP gateway receives WML pages; a browser describes its requirement based
on descriptors (in HTTP headers) or the access mechanism (e.g., HTTP would imply HTML, at least at the beginning of
the deployment, until some CML browsers are available). The determination can also be made depending on the
requested page: if the browser asks for xxxx.html, it means that CML is transcoded into HTML. If it asks for yyyy.vxml,
it means that it is transcoded into VoiceXML, etc. Clearly, this guarantees support of the current infrastructure
and any of its future evolutions.
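
The request-based determination of the target ML described above might be sketched in Python as follows. The extension table, the function names target_ml and needs_transcoding, and the convention that a browser names its markup inside an HTTP header string are assumptions for illustration; the text only states that the extension of the requested page, HTTP descriptors, or the access mechanism can be used.

```python
import os

# Hypothetical mapping from the requested file extension to the target markup.
EXTENSION_TO_ML = {
    ".html": "HTML",
    ".vxml": "VoiceXML",
    ".wml": "WML",
    ".cml": "CML",  # a CML browser is served without transcoding
}


def target_ml(requested_path, descriptor_header=""):
    """Pick the markup to serve: extension first, header descriptor second."""
    ext = os.path.splitext(requested_path)[1].lower()
    if ext in EXTENSION_TO_ML:
        return EXTENSION_TO_ML[ext]
    # Assumed descriptor convention: the markup name appears in the header text.
    header = descriptor_header.lower()
    for ml in ("CML", "VoiceXML", "WML"):
        if ml.lower() in header:
            return ml
    # Plain HTTP access with no other hint implies HTML.
    return "HTML"


def needs_transcoding(requested_path, descriptor_header=""):
    """CML content is transcoded unless the requester asked for CML itself."""
    return target_ml(requested_path, descriptor_header) != "CML"
```

A request for xxxx.html therefore yields HTML and a request for yyyy.vxml yields VoiceXML, matching the two cases named in the text.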

[0167] When a CML browser (i.e., conversational/multi-modal) is released, it will request CML pages (i.e., zzzz.cml)
and can also describe itself as a CML browser. In such case, the pages are served without any transcoding. This
guarantees smooth transition from legacy/today's infrastructure to a CML/conversational dominated web programming
paradigm. Now, legacy content (i.e., static or dynamic content written in HTML, WML, VoiceXML and/or other legacy
languages) needs to be transformed into CML. Tools can be used at best to "guess" the CML target, which then needs to
be verified and re-edited manually. However, for the same reasons as explained above, a viable automatic transcoding
system can be used when the original pages have been built according to specific rules, or when the XML tags are
well defined (domain specific) so that their role in the page is well defined.

[0168] FIG. 12 shows the next step in the deployment road map, when CML conversational (multi-modal) browsers
become the norm. Accordingly, the transcoding is now part of the browser, and the pages are authored and served in CML.
When a legacy (i.e., non-CML) page is provided, it is fetched by the multi-modal shell but then it will be directly
transmitted to the corresponding rendering browser that handles the corresponding modality.

[0169] CML content and legacy content of course still need to be authored or transformed to CML as described
above.

[0170] Referring now to FIG. 13, a block diagram is shown of a multimodal browser architecture according to the
present invention. As shown, a multimodal browser 60 comprises a multimodal or conversational shell 62, a GUI
rendering browser component 64 and a speech rendering browser component 66. The multimodal shell is also referred to
as a "virtual browser." It is to be understood that while the multimodal browser 60 depicts the use of two modalities,
vision (browser component 64) and speech (browser component 66), the invention is not limited to these modalities.
The multimodal browser 60 operates generally as follows. A user desiring to access an application interfaces with a
client device (e.g., personal computer, laptop computer, personal digital assistant, etc.) on which all or portions of the
multimodal browser reside. In the general case shown in FIG. 13, the user can do this via a textual and/or graphic
interface (GUI input/output), and/or the interface can be via speech (audio input/output). While FIG. 13 illustrates the
multimodal browser 60 in one block, it will be explained below that the multimodal browser may be implemented over
multiple devices, including both client and server computer systems.

[0171] Based on the user's request, the multimodal browser 60 sends an appropriate URL to a content server 69,
which also services conversational engines 68 that may also reside on the client device, in order to request access to
the particular desired application. CML code associated with the application is then downloaded from the content server
69 to the multimodal browser 60. The multimodal browser then generates the modality-specific renderings (GUI
representation and/or speech representation) based on the conversational gestures associated with the CML code. The
user thus interacts with the browser 60 via these representations.

[0172] Referring now to FIG. 14 (with continued reference to FIG. 13), a more detailed flow diagram is shown
illustrating the operation of a multimodal browser according to one embodiment of the invention. An application developer
writes an application, e.g., a light-weight application referred to as infoware, in CML. Infoware authored in CML is
hosted by a conversational shell (e.g., multimodal shell 62 of FIG. 13) that mediates amongst multiple modality-specific
browser components (e.g., visual browser 64 and speech browser 66 of FIG. 13). The multimodal shell may
be thought of as a CML interpreter or processor. This is illustrated in FIG. 14 as block 70. User interaction proceeds
by the CML interpreter mapping CML instances associated with the downloaded CML code to appropriate
modality-specific languages such as HTML (block 77) and VoiceXML (block 78). These modality-specific representations render
modality-specific versions of the dialog associated with the application. As illustrated in block 70, the nodes (A) and
arrows (B) represent the declarative program in CML. The gestures in the CML program are represented by each of
the nodes and the arrows represent the flow of the interaction/dialog with possible bifurcation points or loops. Each
gesture is identified by a node ID (node_id) that allows appropriate identification of the activated gesture for
synchronization between the different registered modalities. The node_id identifies the gesture so that the CML browser (i.e.,
the multimodal shell or virtual browser) knows where it is in the dialog flow and where to go from there (e.g., update
the different modalities or send variables to the server and fetch a new CML page).

[0173] The transformation from CML to modality-specific representations 77 and 78 is governed by XSL
transformation rules (or other transformation mechanisms, as mentioned above). These XSL rules are modality-specific. These
transformations are handled by the presentation generation block 72 in accordance with the XSL rules 74 and the
registration table 76. The registration table 76 is a repository of default gesture XSL transformation rules, as well as
the specific rules that are extensions, application specific, device specific or user specific. In the process of mapping
the CML instance to an appropriate modality-specific representation, the XSL rules add the necessary information
needed to realize modality-specific user interaction. As an example, when translating element select to VoiceXML, the
relevant XSL transformation rule handles the generation of the grammar that covers the valid choices for that
conversational gesture.
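
To make the select example concrete, the following Python sketch plays the role of such a transformation rule, written imperatively instead of in XSL. It reads a CML select and emits a VoiceXML-like field whose grammar lists exactly the valid choices. The function name select_to_voicexml, the sample fragment, the name attribute, and the simple "a | b" alternation syntax for the grammar text are assumptions for illustration and are not taken from the patent or from a particular VoiceXML version.

```python
import xml.etree.ElementTree as ET

# Hypothetical CML select used for the example.
SAMPLE_SELECT = (
    '<select name="drink">'
    "<message>What would you like to drink?</message>"
    "<choices>"
    '<choice value="c">coffee</choice>'
    '<choice value="t">tea</choice>'
    "</choices>"
    "</select>"
)


def select_to_voicexml(cml_select):
    """Map a CML select element to a VoiceXML-like field with a grammar."""
    select = ET.fromstring(cml_select)
    field = ET.Element("field", {"name": select.get("name", "")})
    prompt = ET.SubElement(field, "prompt")
    prompt.text = select.findtext("message", default="").strip()
    # The grammar covers exactly the valid choices of the gesture.
    phrases = [c.text.strip() for c in select.iter("choice") if c.text]
    grammar = ET.SubElement(field, "grammar")
    grammar.text = " | ".join(phrases)
    return field
```

The point of the sketch is that the grammar is derived from the gesture's own choices, so the author of the CML page never writes speech-specific markup.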

[0174] The process of transforming CML instances to modality-specific representations such as HTML may result
in a single CML node mapping to a collection of nodes in the output representation. To help synchronize across these
various representations, CML attribute node_id is applied to all output nodes resulting from a given CML node. When
a given CML instance is mapped to different representations, e.g., HTML and VoiceXML, by the appropriate
modality-specific XSL rules, the shape of the tree in the output is likely to vary amongst the various modalities. However, attribute
node_id allows us to synchronize amongst these representations by providing a conceptual backlink from each
modality-specific representation to the originating CML node. This is graphically depicted in block 70 of FIG. 14.
[0175] As user interaction proceeds, variables defined in the environment by the current CML instance get bound to
validated values. This binding happens first in one of the modality-specific representations (registered clients) 77 and
78. The modality-specific representation sends an appropriate message to the CML interpreter (multimodal shell)
comprising the updated environment and the node_id of the gesture that was just completed. Once the updated binding
has been propagated to the CML interpreter, it messages all modality-specific representations with the node_id of the
gesture just completed. Modality-specific representations update their presentation upon receiving this message by
first querying the CML interpreter for the portion of the environment that affects their presentation.
[0176] FIG. 15 illustrates the different steps performed by a CML multi-modal browser according to one embodiment
of the present invention. When a CML page is fetched by the browser, the browser parses the CML content, e.g., similar
in operation to an XML parser (step 90). The browser builds an internal representation of the interaction (i.e., the
graph/tree of the different gestures described in the page) and the node-id. Using the gesture XSL transformation rules (or other
transformation mechanisms like Java Beans or Java Server Pages) stored in the browser (block 98), it builds (step 96)
the different ML pages sent to each rendering browser (block 100). Upon I/O events in a modality, the effect is examined
(step 92) at the level of the interaction graph (i.e., as stored in the MM shell registration table (block 94) as described
in Y0999-178). Note that the gesture XSL transformation rules can be overwritten by the application developer
indicating where they should be downloaded from. They can also be overwritten by user, application or device preference from
what would otherwise be the default behavior. New gestures can also be added, in which case the associated XSL
rules must be provided (e.g., a URL where to get them).
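
The layering of default rules with application, device and user overrides can be sketched in Python with a chained lookup. The precedence order used here (user, then device, then application, then default) is an assumption, since the text does not rank the override sources; the stylesheet paths, DEFAULT_RULES, build_rule_table and rule_for are likewise hypothetical names for this example.

```python
from collections import ChainMap

# Hypothetical default rule set: gesture name -> stylesheet location.
DEFAULT_RULES = {"select": "default/select.xsl", "menu": "default/menu.xsl"}


def build_rule_table(default, application=None, device=None, user=None):
    """Layer gesture -> rule mappings; the first layer holding a key wins."""
    # Assumed precedence (not stated in the text): user, device, application, default.
    return ChainMap(user or {}, device or {}, application or {}, default)


def rule_for(table, gesture):
    """Look up the rule for a gesture; a new gesture must bring its own rule."""
    try:
        return table[gesture]
    except KeyError:
        raise KeyError("no XSL rule registered for gesture %r" % gesture) from None
```

An application that introduces a new gesture simply adds an entry for it in its own layer, which mirrors the requirement that new gestures come with their associated rules.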

[0177] As previously mentioned, the present invention provides for a multi-device or distributed browsing
environment. Due to the nature of CML and its ability to effectively synchronize multiple browsers, various portions of an
application may reside and be executed on separate computing devices. A user may then simultaneously interact with
more than one device, e.g., a laptop computer and a cellular phone, when accessing an application. This is actually
not limited to browsing in different modalities: even in a same modality (e.g., GUI only), the same principle can be used
to describe in advance what are the devices where some content needs to be rendered and to synchronize this rendering
across modalities: e.g., display of image on one device, video in another and text plus background in a third. Another
example is: text and images in one and applets in another, etc. Many more examples are easily conceivable. This would
require using customized gestures or gesture XSL rules. Alternatively, this would require another mark-up (with other
gestures and default rendering) to do that.

[0178] Referring now to FIG. 16, such a distributed browsing environment is illustrated. The functions and operations
of the multimodal browser 62, the visual browser 64, the speech browser 66, the conversational engines 68 and the
content server 69 are the same as described above with respect to FIGs. 13 and 14. However, as can be seen, the
components are distributed on multiple computing devices. For example, the multimodal browser 62 resides on a server
80, the visual browser 64 resides on a client device 82, and the speech browser resides on a server 84. These client
and server devices may be in communication via the WWW, a local network, or some other suitable network. The user
may be local to the client device 82, while the servers 80 and 84 are remotely located. Alternatively, all or some of the
computing systems may be collocated. Since the user interacts directly with the client device 82, audio input/output
facilities 86 (e.g., microphone and speaker) are provided at the device 82, which are connected to the speech browser
at the server 84. As can be seen, the same synchronized operation of a CML application may be accomplished even
though the various components of the multimodal browser are located on separate computing devices.

[0179] It is to be appreciated that each client device and server described above for implementing the methodologies
of the present invention may comprise a processor operatively coupled to memory and I/O devices. It is to be
appreciated that the term "processor" as used herein is intended to include any processing device, such as, for example,
one that includes a CPU (central processing unit). The term "memory" as used herein is intended to include memory
associated with a processor or CPU, such as, for example, RAM, ROM, a fixed memory device (e.g., hard drive), a
removable memory device (e.g., diskette), flash memory, etc. In addition, the term "input/output devices" or "I/O devices"
as used herein is intended to include, for example, one or more input devices, e.g., keyboard, microphone, etc., for
inputting data to the processing unit, and/or one or more output devices, e.g., CRT display, a speaker, etc., for presenting
results associated with the processing unit. The input/output devices are modality specific and therefore other devices
may be employed. It is also to be understood that "processor" may refer to more than one processing device and that
various elements associated with a processing device may be shared by other processing devices. Accordingly,
software components including instructions or code for performing the methodologies of the invention, as described herein,
may be stored in one or more of the associated memory devices (e.g., ROM, fixed or removable memory) and, when
ready to be utilized, loaded in part or in whole (e.g., into RAM) and executed by a CPU.

G. Alternative Embodiments

[01 80] Among the possible extensions that trivially result from the teaching of this invention we have the following. 

(i) Mum-device browsing (even in a given modality) as discussed above. 

(ii) Multi-geographic support some gestures (e.g., telephone number, address etc.) can be adapted to the local 
format as well as language. This can be combined with a text-to-text translation system to provide a fully automatic 
localization mechanism (select yes/No, becomes select Oui/Non) trivially through different XSL rules. Altematively, 
in the absence of such an automatic transcoder, the system can be used as part of development/localization tools 
to speed up the localization/intemationalization, geogrephy/region adaptation. 

(iii) Conversational Foundation Classes (CFC): The conversational foundation classes were introduced in Y0999-111
as being imperative dialog components that are independent of the modality and that can run in parallel and in series
to build more complex dialogs. Combined with the services provided by the conversational application platform
(CVM - conversational virtual machine), they allow programming of imperative conversational (multi-modal)
applications by loading/linking to the libraries of these foundation classes that the platform provides. As each CVM
platform provides them, the application developer can use them and not worry about the rendering within the
modality/modalities supported by the device and their synchronization. Accordingly, each gesture defined declaratively
in the CML specification provided herein can have an imperative implementation (e.g., in Java) that can run in series
(one after the other) or in parallel (more than one active - like more than one form active at a time). Programming
in CFC is equivalent to programming imperatively by interaction: you use and link to some imperative gestures,
you hook them to the backend and connect the gestures together by conventional code. You may add some modality
specific customization in this code or in the CFC arguments. Then, you let the platform (CVM or a browser that
implements the same level of functionality) handle the rendering within the appropriate modality and the appropriate
synchronization between modalities as hard coded in the foundation class. An example would be a case where all
the foundation classes are provided as Java classes. This allows extension of the programming by interaction
model to Java applets or servlets, etc.

(iv) Hybrid programming by interaction is a combination of declarative and imperative: CML pages with calls to
CFC and other objects built using CFC (and more task specific), e.g., Java applets. Therefore, the programming by
interaction programming model is to be considered as generally covering all the programming modes.

(v) Scripting: CML can support any scripting that we want to re-use (ECMAScript as defined at
http://www.ecma.ch/stand/ecma-262.htm, etc.) directly as a scripting language of the multi-modal shell. Modality
specific scripts (like JavaScript or WMLScript) have to be considered as modality specific scripting languages.
Although it is possible to define today (i.e., for the step where we use today's infrastructure) a more detailed
behavior of how an ECMAScript in CML would be transcoded for legacy browsers, such scripts can simply be handled
as modality specific (i.e., like an image).
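
As an illustration of extension (ii) above, the following is a minimal Python sketch of how a yes/no gesture might be localized purely by swapping rule sets. It is not the XSL mechanism itself: the `BooleanGesture` class, the `LOCALE_RULES` table and the `render_choices` function are hypothetical names introduced here, and the dictionaries merely stand in for locale-specific XSL rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BooleanGesture:
    """Hypothetical stand-in for a modality-independent yes/no gesture."""
    name: str


# Hypothetical rule tables, one per locale, standing in for locale-specific
# XSL rules. Only the labels differ; the gesture itself is unchanged.
LOCALE_RULES = {
    "en": {"yes": "Yes", "no": "No"},
    "fr": {"yes": "Oui", "no": "Non"},
}


def render_choices(gesture: BooleanGesture, locale: str) -> str:
    """Render the gesture's choice labels with the rules of the given locale."""
    rules = LOCALE_RULES[locale]
    return f"[{gesture.name}] select {rules['yes']}/{rules['no']}"


confirm = BooleanGesture(name="confirm")
print(render_choices(confirm, "en"))
print(render_choices(confirm, "fr"))
```

The same gesture object is rendered for both locales; only the rule table selected at rendering time changes, which is the property that extension (ii) relies on.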

[0181] Although illustrative embodiments of the present invention have been described herein with reference to the
accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that
various other changes and modifications may be effected therein by one skilled in the art without departing from the
scope of the invention.



Claims 

1. A method of programming an application accessible by a user through one or more computer-based devices, the
method comprising the steps of:

representing interactions that the user is permitted to have with the one or more computer-based devices
used to access the application by interaction-based programming components;

wherein the interaction-based programming components are independent of content/application logic and
presentation requirements associated with the application, and further wherein the interaction-based programming
components are transcoded on a component by component basis to generate one or more modality-specific
renderings of the application on the one or more computer-based devices.

2. The method of claim 1, in a client/server arrangement wherein at least a portion of the application is to be
downloaded from a server to at least one of the one or more computer-based devices, acting as a client, further
comprising the step of including code in the application operative to provide a connection to the content/application
logic resident at the server.

3. The method of claim 2, wherein the content/application logic connection code expresses at least one of one or
more data models, attribute constraints and validation rules associated with the application.

4. The method of any one of claims 1 to 3, wherein the one or more modality-specific renderings comprise a
speech-based representation of portions of the application.

5. The method of claim 4, wherein the speech-based representation is based on VoiceXML.

6. The method of claim 1, wherein the one or more modality-specific renderings comprise a visual-based
representation of portions of the application.

7. The method of claim 6, wherein the visual-based representation is based on at least one of HTML, CHTML and
WML.

8. The method of any one of claims 1 to 7, wherein the user interactions are declaratively represented by the
interaction-based programming components.

9. The method of any one of claims 1 to 8, wherein the user interactions are imperatively represented by the
interaction-based programming components.

10. The method of any one of claims 1 to 9, wherein the representation permits reference to dynamically generated
data and supports callback mechanisms to the content/application logic.

11. The method of any one of claims 1 to 10, wherein the interaction-based programming components comprise basic
elements associated with a dialog that may occur between the user and the one or more computer-based devices.

12. The method of claim 11, wherein the interaction-based programming components comprise complex elements,
the complex elements being aggregations of two or more of the basic elements associated with the dialog that
may occur between the user and the one or more computer-based devices.

13. The method of any one of claims 1 to 12, wherein one of the interaction-based programming components represent
conversational gestures.

14. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating informational
messages to the user.

15. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating contextual
help information.

16. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating actions to be
taken upon successful completion of another gesture.

17. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating yes or no
based questions.

18. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating dialogues
where the user is expected to select from a set of choices.

19. The method of claim 18, wherein the select gesture comprises a subelement that represents the set of choices.

20. The method of claim 18, wherein the select gesture comprises a subelement that represents a test that the
selection should pass.

21. The method of claim 20, wherein the select gesture comprises a subelement that represents an error message to
be presented if the test fails.

22. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating rules for
validating results of a given conversational gesture.

23. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating grammar
processing rules.

24. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating dialogues that
help the user navigate through portions of the application.

25. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating a request for
at least one of user login and authentication information.

26. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating a request for
constrained user input.

27. The method of claim 13, wherein the conversational gestures comprise a gesture for encapsulating a request for
unconstrained user input.

28. The method of claim 13, wherein the conversational gestures comprise a gesture for controlling submission of
information.

29. The method of any one of claims 1 to 28, further comprising the step of providing a mechanism for defining logical
input events and the association between the logical input events and physical input events that trigger the defined
logical input events.

30. The method of any one of claims 1 to 29, wherein the component by component transcoding is performed in
accordance with XSL transformation rules.

31. The method of any one of claims 1 to 30, wherein the component by component transcoding is performed in
accordance with Java Bean.

32. The method of any one of claims 1 to 31, wherein the component by component transcoding is performed in
accordance with Java Server Pages.

33. The method of any one of claims 1 to 32, wherein representation by the interaction-based programming components
permits synchronization of the one or more modality-specific renderings of the application on the one or more
computer-based devices.

34. The method of any one of claims 1 to 33, wherein representation by the interaction-based programming
components supports a natural language understanding environment.

35. The method of any one of claims 1 to 34, further comprising the step of including code for permitting cosmetic
altering of a presentational feature associated with the one or more modality-specific renderings of the application
on the one or more computer-based devices.

36. The method of any one of claims 1 to 35, further comprising the step of including code for permitting changes to
rules for transcoding on a component by component basis to generate the one or more modality-specific renderings
of the application on the one or more computer-based devices.

37. The method of any one of claims 1 to 36, wherein a definition of an underlying data model being populated is
separated from a markup language defining the user interaction.

38. The method of any one of claims 1 to 37, wherein a node_id attribute is attached to each component and the
attribute is mapped over to various outputs.

39. The method of any one of claims 1 to 38, wherein an author is provided with a pass through mechanism to
encapsulate modality-specific markup components.

40. The method of any one of claims 1 to 39, wherein the components may be active in parallel.

41. The method of any one of claims 1 to 40, wherein the representation and transcoding is extensible.

42. The method of any one of claims 1 to 41, wherein a state of the application is encapsulated.

43. Apparatus for use in accessing an application in association with one or more computer-based devices, the
apparatus comprising:

one or more processors operative to: (i) obtain the application from an application server, the application
being programmatically represented by interactions that the user is permitted to have with the one or more
computer-based devices by interaction-based programming components, wherein the interaction-based programming
components are independent of content/application logic and presentation requirements associated with the
application; and (ii) transcode the interaction-based programming components on a component by component basis
to generate one or more modality-specific renderings of the application on the one or more computer-based
devices.

44. The apparatus of claim 43, wherein the one or more processors are distributed over the one or more
computer-based devices.
Figure callouts: GESTURE: TITLE (20); GESTURE: MESSAGE (22); EXCLUSIVE SELECTION.

<?xml version="1.0"?>
<!-- $Id: drink.cml,v 1.3 1999/11/12 22:36:21 $ -->
<cml node_id="0"
     name="drink"
     title="Global Cafe"
     action="http://localhost/servlet/drink">
  <select name="drink"
          node_id="1">
    <message node_id="2">
      Would you like coffee, tea, milk, or nothing?
    </message>
    <choices node_id="3">
      <default value="coffee">Coffee</default>
      <choice value="tea">Tea</choice>
      <choice value="milk">Milk</choice>
      <choice value="nothing">Nothing</choice>
    </choices>
  </select>
</cml>
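
To make the component by component transcoding of claims 1 and 30 concrete for the drink listing above, here is a minimal Python sketch. It is an illustration under stated assumptions, not the XSL rules of the invention: the functions `select_to_html`, `select_to_voicexml` and `transcode`, the `RULES` table, and the exact HTML and VoiceXML-like output shapes are all hypothetical, and only the `select` gesture is handled (the `title` and `action` attributes are ignored).

```python
import xml.etree.ElementTree as ET
from html import escape

CML = """<cml node_id="0" name="drink" title="Global Cafe"
     action="http://localhost/servlet/drink">
  <select name="drink" node_id="1">
    <message node_id="2">Would you like coffee, tea, milk, or nothing?</message>
    <choices node_id="3">
      <default value="coffee">Coffee</default>
      <choice value="tea">Tea</choice>
      <choice value="milk">Milk</choice>
      <choice value="nothing">Nothing</choice>
    </choices>
  </select>
</cml>"""


def select_to_html(select: ET.Element) -> str:
    """Transcode one select gesture to an HTML fragment with radio buttons."""
    name = select.get("name")
    message = select.find("message")
    # The node_id is carried into the output so views can be tied back to CML.
    parts = [f'<p id="node-{message.get("node_id")}">{escape(message.text.strip())}</p>']
    for option in select.find("choices"):
        checked = " checked" if option.tag == "default" else ""
        parts.append(
            f'<label><input type="radio" name="{name}" '
            f'value="{option.get("value")}"{checked}> {escape(option.text)}</label>'
        )
    return "\n".join(parts)


def select_to_voicexml(select: ET.Element) -> str:
    """Transcode the same gesture to a VoiceXML-like field with options."""
    message = select.find("message")
    parts = [f'<field name="{select.get("name")}">',
             f"  <prompt>{escape(message.text.strip())}</prompt>"]
    for option in select.find("choices"):
        parts.append(f'  <option value="{option.get("value")}">{escape(option.text)}</option>')
    parts.append("</field>")
    return "\n".join(parts)


# One rule per (modality, gesture) pair: each gesture is transcoded on its own.
RULES = {
    "html": {"select": select_to_html},
    "voicexml": {"select": select_to_voicexml},
}


def transcode(cml_text: str, modality: str) -> str:
    """Apply the modality's rule to each top-level gesture of the CML page."""
    root = ET.fromstring(cml_text)
    rules = RULES[modality]
    return "\n".join(rules[g.tag](g) for g in root if g.tag in rules)


print(transcode(CML, "html"))
print(transcode(CML, "voicexml"))
```

Because the rules are looked up per gesture, supporting another modality in this sketch only means adding another entry to `RULES`; the CML source is left untouched.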
FIG. 4
FIG. 6A
FIG. 6B
FIG. 6C
FIG. 8
FIG. 9
FIG. 11



FIG. 12
[Block diagram not reproduced. Recoverable labels: SERVER; CML CONTENT; TRANSCODING EDITING OF LEGACY MLs AND
APPLICATIONS; HTML CONTENT + XML, WML, VoiceXML, ETC.; GESTURE TRANSCODING WHEN PAGES ARE FETCHED; CML TO HTML;
CML TO WML; CML TO VoiceXML; NETWORK; HTML; WML; VoiceXML; LEGACY BROWSERS; HTML BROWSER; WML BROWSER;
VoiceXML BROWSER; MULTI-MODAL BROWSER; GESTURE TRANSCODING WHEN GESTURE IS ACTIVATED; SAME CONTENT AND
BACKEND LOGIC.]






EP1 100 013 A2 



FIG. 13
[Block diagram not reproduced. Recoverable labels: MULTI-MODAL BROWSER (60); VISUAL BROWSER: GUI RENDERING
BROWSER; SPEECH BROWSER: SPEECH RENDERING BROWSER; MULTI-MODAL SHELL VIRTUAL BROWSER; GUI I/O; AUDIO I/O;
HTML OR WML; DISTRIBUTED CONVERSATIONAL PROTOCOLS; CONVERSATIONAL ENGINES; URL; http; CML; CONTENT SERVER.
Other reference numerals visible: 62, 64, 66, 68.]



FIG. 14
[Block diagram not reproduced. Recoverable labels: input CML; DIALOG-BASED PROGRAMMING; Node id;
SYNCHRONIZATION DATA; PRESENTATION GENERATION; Nodeid_?.html ("The html page could be XHTML or WML");
Nodeid_?.VoiceXML; MODALITY SPECIFIC RENDERING (two blocks); XSL; REGISTRATION TABLE. Reference numerals
visible: 74, 76, 78.]
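
Claim 38 states that a node_id attribute is attached to each component and mapped over to the various outputs, and the FIG. 14 labels mention a registration table alongside per-node outputs such as Nodeid_?.html and Nodeid_?.VoiceXML. The Python sketch below shows one way such a table could be organized. It is an assumption made for illustration only: the `RegistrationTable` class, its `register` and `peers` methods, and the concrete file names are hypothetical and are not taken from the specification.

```python
class RegistrationTable:
    """Hypothetical table mapping a CML node_id to its per-modality views."""

    def __init__(self) -> None:
        # node_id -> {modality name -> reference to the rendered view}
        self._views: dict[str, dict[str, str]] = {}

    def register(self, node_id: str, modality: str, view_ref: str) -> None:
        """Record the view generated for a node in a given modality."""
        self._views.setdefault(node_id, {})[modality] = view_ref

    def peers(self, node_id: str, source_modality: str) -> dict[str, str]:
        """Return the views of the same node in every other modality."""
        views = self._views.get(node_id, {})
        return {m: ref for m, ref in views.items() if m != source_modality}


table = RegistrationTable()
table.register("2", "html", "Nodeid_2.html")
table.register("2", "voicexml", "Nodeid_2.VoiceXML")

# When the user acts on node 2 in the HTML view, look up the views that
# would need to be updated to stay synchronized.
print(table.peers("2", "html"))
```

Keying the table on node_id rather than on any modality-specific identifier is what lets an event in one view be traced back to the gesture and forward to the other views in this sketch.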






FIG. 15 